Furuta, Hiroto
Horikawa, Yukio 
(ii) TITLE OF INVENTION: MUTATIONS IN THE DIABETES SUSCEPTIBILITY

(iii) NUMBER OF SEQUENCES: 146

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Arnold, White & Durkee 

(B) STREET: P.O. Box 4433 

(C) CITY: Houston 

(D) STATE: Texas 

(E) COUNTRY: USA 

(F) ZIP: 77210 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(A) TELEPHONE: 512/418-3000

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3238 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 988 

(D) OTHER INFORMATION: /mod_base= OTHER 
/note= "N = A, C, G, or T" 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(24..986, 990..1916)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGTGGCCCTG TGGCAGCCGA GCC ATG GTT TCT AAA CTG AGC CAG CTG CAG 

Met Val Ser Lys Leu Ser Gln Leu Gln
1 5 

ACG GAG CTC CTG GCG GCC CTG CTC GAG TCA GGG CTG AGC AAA GAG GCA 
Thr Glu Leu Leu Ala Ala Leu Leu Glu Ser Gly Leu Ser Lys Glu Ala 
10 15 20 25 

CTG ATC CAG GCA CTG GGT GAG CCG GGG CCC TAC CTC CTG GCT GGA GAA 
Leu Ile Gln Ala Leu Gly Glu Pro Gly Pro Tyr Leu Leu Ala Gly Glu
30 35 40 

GGC CCC CTG GAC AAG GGG GAG TCC TGC GGC GGC GGT CGA GGG GAG CTG 
Gly Pro Leu Asp Lys Gly Glu Ser Cys Gly Gly Gly Arg Gly Glu Leu 
45 50 55 

GCT GAG CTG CCC AAT GGG CTG GGG GAG ACT CGG GGC TCC GAG GAC GAG 
Ala Glu Leu Pro Asn Gly Leu Gly Glu Thr Arg Gly Ser Glu Asp Glu 
60 65 70 

ACG GAC GAC GAT GGG GAA GAC TTC ACG CCA CCC ATC CTC AAA GAG CTG 

Thr Asp Asp Asp Gly Glu Asp Phe Thr Pro Pro Ile Leu Lys Glu Leu
75 80 85 

GAG AAC CTC AGC CCT GAG GAG GCG GCC CAC CAG AAA GCC GTG GTG GAG 
Glu Asn Leu Ser Pro Glu Glu Ala Ala His Gln Lys Ala Val Val Glu
90 95 100 105 

ACC CTT CTG CAG GAG GAC CCG TGG CGT GTG GCG AAG ATG GTC AAG TCC 
Thr Leu Leu Gln Glu Asp Pro Trp Arg Val Ala Lys Met Val Lys Ser
110 115 120 

TAC CTG CAG CAG CAC AAC ATC CCA CAG CGG GAG GTG GTC GAT ACC ACT 
Tyr Leu Gln Gln His Asn Ile Pro Gln Arg Glu Val Val Asp Thr Thr
125 130 135 

GGC CTC AAC CAG TCC CAC CTG TCC CAA CAC CTC AAC AAG GGC ACT CCC 
Gly Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro
140 145 150 

ATG AAG ACG CAG AAG CGG GCC GCC CTG TAC ACC TGG TAC GTC CGC AAG 
Met Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys
155 160 165 

CAG CGA GAG GTG GCG CAG CAG TTC ACC CAT GCA GGG CAG GGA GGG CTG 578
Gln Arg Glu Val Ala Gln Gln Phe Thr His Ala Gly Gln Gly Gly Leu
170 175 180 185

ATT GAA GAG CCC ACA GGT GAT GAG CTA CCA ACC AAG AAG GGG CGG AGG 626
Ile Glu Glu Pro Thr Gly Asp Glu Leu Pro Thr Lys Lys Gly Arg Arg
190 195 200

AAC CGT TTC AAG TGG GGC CCA GCA TCC CAG CAG ATC CTG TTC CAG GCC 674 
Asn Arg Phe Lys Trp Gly Pro Ala Ser Gln Gln Ile Leu Phe Gln Ala
205 210 215 






TAT GAG AGG CAG AAG AAC CCT AGC AAG GAG GAG CGA GAG ACG CTA GTG 722 
Tyr Glu Arg Gln Lys Asn Pro Ser Lys Glu Glu Arg Glu Thr Leu Val
220 225 230



GAG GAG TGC AAT AGG GCG GAA TGC ATC CAG AGA GGG GTG TCC CCA TCA 770
Glu Glu Cys Asn Arg Ala Glu Cys Ile Gln Arg Gly Val Ser Pro Ser
235 240 245

CAG GCA CAG GGG CTG GGC TCC AAC CTC GTC ACG GAG GTG CGT GTC TAC 818
Gln Ala Gln Gly Leu Gly Ser Asn Leu Val Thr Glu Val Arg Val Tyr
250 255 260 265

AAC TGG TTT GCC AAC CGG CGC AAA GAA GAA GCC TTC CGG CAC AAG CTG 866
Asn Trp Phe Ala Asn Arg Arg Lys Glu Glu Ala Phe Arg His Lys Leu
270 275 280

GCC ATG GAC ACG TAC AGC GGG CCC CCC CCA GGG CCA GGC CCG GGA CCT 914
Ala Met Asp Thr Tyr Ser Gly Pro Pro Pro Gly Pro Gly Pro Gly Pro
285 290 295



GCG CTG CCC GCT CAC AGC TCC CCT GGC CTG CCT CCA CCT GCC CTC TCC 962
Ala Leu Pro Ala His Ser Ser Pro Gly Leu Pro Pro Pro Ala Leu Ser
300 305 310



CCC AGT AAG GTC CAC GGT GTG CGC TNT GGA CAG CCT GCG ACC AGT GAG 1010
Pro Ser Lys Val His Gly Val Arg Gly Gln Pro Ala Thr Ser Glu
315 320 325 

ACT GCA GAA GTA CCC TCA AGC AGC GGC GGT CCC TTA GTG ACA GTG TCT 1058
Thr Ala Glu Val Pro Ser Ser Ser Gly Gly Pro Leu Val Thr Val Ser
330 335 340 

ACA CCC CTC CAC CAA GTG TCC CCC ACG GGC CTG GAG CCC AGC CAC AGC 1106 
Thr Pro Leu His Gln Val Ser Pro Thr Gly Leu Glu Pro Ser His Ser
345 350 355 360

CTG CTG AGT ACA GAA GCC AAG CTG GTC TCA GCA GCT GGG GGC CCC CTC 1154 
Leu Leu Ser Thr Glu Ala Lys Leu Val Ser Ala Ala Gly Gly Pro Leu 
365 370 375 



CCC CCT GTC AGC ACC CTG ACA GCA CTG CAC AGC TTG GAG CAG ACA TCC 1202 
Pro Pro Val Ser Thr Leu Thr Ala Leu His Ser Leu Glu Gln Thr Ser
380 385 390 



CCA GGC CTC AAC CAG CAG CCC CAG AAC CTC ATC ATG GCC TCA CTT CCT 1250
Pro Gly Leu Asn Gln Gln Pro Gln Asn Leu Ile Met Ala Ser Leu Pro
395 400 405



GGG GTC ATG ACC ATC GGG CCT GGT GAG CCT GCC TCC CTG GGT CCT ACG 1298 
Gly Val Met Thr Ile Gly Pro Gly Glu Pro Ala Ser Leu Gly Pro Thr
410 415 420 

TTC ACC AAC ACA GGT GCC TCC ACC CTG GTC ATC GGC CTG GCC TCC ACG 1346
Phe Thr Asn Thr Gly Ala Ser Thr Leu Val Ile Gly Leu Ala Ser Thr
425 430 435 440 

CAG GCA CAG AGT GTG CCG GTC ATC AAC AGC ATG GGC AGC AGC CTG ACC 1394
Gln Ala Gln Ser Val Pro Val Ile Asn Ser Met Gly Ser Ser Leu Thr
445 450 455 

ACC CTG CAG CCC GTC CAG TTC TCC CAG CCG CTG CAC CCC TCC TAC CAG 1442
Thr Leu Gln Pro Val Gln Phe Ser Gln Pro Leu His Pro Ser Tyr Gln
460 465 470 

CAG CCG CTC ATG CCA CCT GTG CAG AGC CAT GTG ACC CAG AGC CCC TTC 1490 
Gln Pro Leu Met Pro Pro Val Gln Ser His Val Thr Gln Ser Pro Phe
475 480 485 

ATG GCC ACC ATG GCT CAG CTG CAG AGC CCC CAC GCC CTC TAC AGC CAC 1538
Met Ala Thr Met Ala Gln Leu Gln Ser Pro His Ala Leu Tyr Ser His
490 495 500

AAG CCC GAG GTG GCC CAG TAC ACC CAC ACG GGC CTG CTC CCG CAG ACT 1586
Lys Pro Glu Val Ala Gln Tyr Thr His Thr Gly Leu Leu Pro Gln Thr
505 510 515 520

ATG CTC ATC ACC GAC ACC ACC AAC CTG AGC GCC CTG GCC AGC CTC ACG 1634 
Met Leu Ile Thr Asp Thr Thr Asn Leu Ser Ala Leu Ala Ser Leu Thr
525 530 535

CCC ACC AAG CAG GTC TTC ACC TCA GAC ACT GAG GCC TCC AGT GAG TCC 1682
Pro Thr Lys Gln Val Phe Thr Ser Asp Thr Glu Ala Ser Ser Glu Ser
540 545 550

GGG CTT CAC ACG CCG GCA TCT CAG GCC ACC ACC CTC CAC GTC CCC AGC 1730
Gly Leu His Thr Pro Ala Ser Gln Ala Thr Thr Leu His Val Pro Ser
555 560 565 

CAG GAC CCT GCC GGC ATC CAG CAC CTG CAG CCG GCC CAC CGG CTC AGC 1778 
Gln Asp Pro Ala Gly Ile Gln His Leu Gln Pro Ala His Arg Leu Ser
570 575 580

GCC AGC CCC ACA GTG TCC TCC AGC AGC CTG GTG CTG TAC CAG AGC TCA 1826
Ala Ser Pro Thr Val Ser Ser Ser Ser Leu Val Leu Tyr Gln Ser Ser
585 590 595 600 

GAC TCC AGC AAT GGC CAG AGC CAC CTG CTG CCA TCC AAC CAC AGC GTC 1874
Asp Ser Ser Asn Gly Gln Ser His Leu Leu Pro Ser Asn His Ser Val
605 610 615 

ATC GAG ACC TTC ATC TCC ACC CAG ATG GCC TCT TCC TCC CAG 1916
Ile Glu Thr Phe Ile Ser Thr Gln Met Ala Ser Ser Ser Gln
620 625 630 






TAACCACGGC ACCTGGGCCC TGGGGCCTGT ACTGCCTGCT TGGGGGGTGA TGAGGGCAGC 1976 
AGCCAGCCCT GCCTGGAGGA CCTGAGCCTG CCGAGCAACC GTGGCCCTTC CTGGACAGCT 2036 



GTGCCTCGCT CCCCACTCTG CTCTGATGCA TCAGAAAGGG AGGGCTCTGA GGCGCCCCAA 2096

CCCGTGGAGG CTGCTCGGGG TGCACAGGAG GGGGTCGTGG AGAGCTAGGA GCAAAGCCTG 2156 

TTCATGGCAG ATGTAGGAGG GACTGTCGCT GCTTCGTGGG ATACAGTCTT CTTACTTGGA 2216 

ACTGAAGGGG GCGGCCTATG ACTTGGGCAC CCCCAGCCTG GGCCTATGGA GAGCCCTGGG 2276 

ACCGCTACAC CACTCTGGCA GCCACACTTC TCAGGACACA GGCCTGTGTA GCTGTGACCT 2336

GCTGAGCTCT GAGAGGCCCT GGATCAGCGT GGCCTTGTTC TGTCACCAAT GTACCCACCG 2396

GGCCACTCCT TCCTGCCCCA ACTCCTTCCA GCTAGTGACC CACATGCCAT TTGTACTGAC 2456 

CCCATCACCT ACTCACACAG GCATTTCCTG GGTGGCTACT CTGTGCCAGA GCCTGGGGCT 2516 

CTAACTGCCT GAGCCCAGGG AGGCCGAAGC TAACAGGGAA GGCAGGCAGG GCTCTCCTGG 2576

TCTTCCCATC CCCAGCGATT CCCTCTCCCA GGCCCCATGA CCTCCAGCTT TCCTGTATTT 2636

CTTCCCAAGA GCATGATGCC TCTGAGGCCA GCCTGGCCTC CTGCCTCTAC TGGGAAGGCT 2696

ACTTCGGGGC TGGGAAGTCG TCCTTACTCC TGTGGGAGCC TCGCAACCCG TGCCAAGTCC 2756

AGGTCCTGGT GGGGCAGCTC CTCTGTCTCG AGCGCCCTGC AGACCCTGCC CTTGTTTGGG 2816

GCAGGAGTAG CTGAGCTCAC AAGGCAGCAA GGCCCGAGCA GCTGAGCAGG GCCGGGGAAC 2876

TGGCCAAGCT GAGGTGCCCA GGAGAAGAAA GAGGTGACCC CAGGGCACAG GAGCTACCTG 2936

TGTGGACAGG ACTAACACTC AGAAGCCTGG GTGCCTGGCT GGCTGAGGGC AGTTCGCAGC 2996

CACCCTGAGG AGTCTGAGGT CCTGAGCACT GCCAGGAGGG ACAAAGGAGC CTGTGAACCC 3056

AGGACAAGCA TGGTCCCACA TCCCTGGGCC TGCTGCTGAG AACCTGGCCT TCAGTGTACC 3116

GCGTCTACCC TGGGATTCAG GAAAAGGCCT GGGGTGACCC GGCACCCCCT GCAGCTTGTA 3176

GCCAGCCGGG GCGAGTGGCA CGTTTATTTA ACTTTTAGTA AAGTCAAGGA GAAATGCGGT 3236

GG 3238

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu
1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu
20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Glu Gly Pro Leu Asp Lys Gly Glu
35 40 45 

Ser Cys Gly Gly Gly Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu 
50 55 60 

Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp 
65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu
85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro
100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile
115 120 125

Pro Gln Arg Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu
130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala
145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln
165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp
180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro
195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro
210 215 220

Ser Lys Glu Glu Arg Glu Thr Leu Val Glu Glu Cys Asn Arg Ala Glu
225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser
245 250 255 

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg 
260 265 270 

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly 
275 280 285 

Pro Pro Pro Gly Pro Gly Pro Gly Pro Ala Leu Pro Ala His Ser Ser 
290 295 300 

Pro Gly Leu Pro Pro Pro Ala Leu Ser Pro Ser Lys Val His Gly Val 
305 310 315 320 

Arg Gly Gln Pro Ala Thr Ser Glu Thr Ala Glu Val Pro Ser Ser Ser
325 330 335

Gly Gly Pro Leu Val Thr Val Ser Thr Pro Leu His Gln Val Ser Pro
340 345 350

Thr Gly Leu Glu Pro Ser His Ser Leu Leu Ser Thr Glu Ala Lys Leu
355 360 365

Val Ser Ala Ala Gly Gly Pro Leu Pro Pro Val Ser Thr Leu Thr Ala
370 375 380

Leu His Ser Leu Glu Gln Thr Ser Pro Gly Leu Asn Gln Gln Pro Gln
385 390 395 400

Asn Leu Ile Met Ala Ser Leu Pro Gly Val Met Thr Ile Gly Pro Gly
405 410 415

Glu Pro Ala Ser Leu Gly Pro Thr Phe Thr Asn Thr Gly Ala Ser Thr
420 425 430

Leu Val Ile Gly Leu Ala Ser Thr Gln Ala Gln Ser Val Pro Val Ile
435 440 445

Asn Ser Met Gly Ser Ser Leu Thr Thr Leu Gln Pro Val Gln Phe Ser
450 455 460

Gln Pro Leu His Pro Ser Tyr Gln Gln Pro Leu Met Pro Pro Val Gln
465 470 475 480

Ser His Val Thr Gln Ser Pro Phe Met Ala Thr Met Ala Gln Leu Gln
485 490 495

Ser Pro His Ala Leu Tyr Ser His Lys Pro Glu Val Ala Gln Tyr Thr
500 505 510

His Thr Gly Leu Leu Pro Gln Thr Met Leu Ile Thr Asp Thr Thr Asn
515 520 525

Leu Ser Ala Leu Ala Ser Leu Thr Pro Thr Lys Gln Val Phe Thr Ser
530 535 540

Asp Thr Glu Ala Ser Ser Glu Ser Gly Leu His Thr Pro Ala Ser Gln
545 550 555 560

Ala Thr Thr Leu His Val Pro Ser Gln Asp Pro Ala Gly Ile Gln His
565 570 575

Leu Gln Pro Ala His Arg Leu Ser Ala Ser Pro Thr Val Ser Ser Ser
580 585 590

Ser Leu Val Leu Tyr Gln Ser Ser Asp Ser Ser Asn Gly Gln Ser His
595 600 605

Leu Leu Pro Ser Asn His Ser Val Ile Glu Thr Phe Ile Ser Thr Gln
610 615 620

Met Ala Ser Ser Ser Gln
625 630 
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3238 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 988

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A, C, G, or T"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(24..986, 990..1916)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CGTGGCCCTG TGGCAGCCGA GCC ATG GTT TCT AAA CTG AGC CAG CTG CAG 

Met Val Ser Lys Leu Ser Gln Leu Gln
1 5 

ACG GAG CTC CTG GCG GCC CTG CTC GAG TCA GGG CTG AGC AAA GAG GCA 
Thr Glu Leu Leu Ala Ala Leu Leu Glu Ser Gly Leu Ser Lys Glu Ala 
10 15 20 25 

CTG ATC CAG GCA CTG GGT GAG CCG GGG CCC TAC CTC CTG GCT GGA GAA 
Leu Ile Gln Ala Leu Gly Glu Pro Gly Pro Tyr Leu Leu Ala Gly Glu
30 35 40 

GGC CCC CTG GAC AAG GGG GAG TCC TGC GGC GGC GGT CGA GGG GAG CTG 
Gly Pro Leu Asp Lys Gly Glu Ser Cys Gly Gly Gly Arg Gly Glu Leu 
45 50 55 

GCT GAG CTG CCC AAT GGG CTG GGG GAG ACT CGG GGC TCC GAG GAC GAG 

Ala Glu Leu Pro Asn Gly Leu Gly Glu Thr Arg Gly Ser Glu Asp Glu 
60 65 70 

ACG GAC GAC GAT GGG GAA GAC TTC ACG CCA CCC ATC CTC AAA GAG CTG 
Thr Asp Asp Asp Gly Glu Asp Phe Thr Pro Pro Ile Leu Lys Glu Leu
75 80 85 

GAG AAC CTC AGC CCT GAG GAG GCG GCC CAC CAG AAA GCC GTG GTG GAG 
Glu Asn Leu Ser Pro Glu Glu Ala Ala His Gln Lys Ala Val Val Glu
90 95 100 105 

ACC CTT CTG CAG GAG GAC CCG TGG CGT GTG GCG AAG ATG GTC AAG TCC 
Thr Leu Leu Gln Glu Asp Pro Trp Arg Val Ala Lys Met Val Lys Ser
110 115 120 

TAC CTG CAG CAG CAC AAC ATC CCA CAG CAG GAG GTG GTC GAT ACC ACT 
Tyr Leu Gln Gln His Asn Ile Pro Gln Gln Glu Val Val Asp Thr Thr
125 130 135 

GGC CTC AAC CAG TCC CAC CTG TCC CAA CAC CTC AAC AAG GGC ACT CCC 
Gly Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro
140 145 150

ATG AAG ACG CAG AAG CGG GCC GCC CTG TAC ACC TGG TAC GTC CGC AAG 530
Met Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys
155 160 165

CAG CGA GAG GTG GCG CAG CAG TTC ACC CAT GCA GGG CAG GGA GGG CTG 578
Gln Arg Glu Val Ala Gln Gln Phe Thr His Ala Gly Gln Gly Gly Leu
170 175 180 185 

ATT GAA GAG CCC ACA GGT GAT GAG CTA CCA ACC AAG AAG GGG CGG AGG 626
Ile Glu Glu Pro Thr Gly Asp Glu Leu Pro Thr Lys Lys Gly Arg Arg
190 195 200

AAC CGT TTC AAG TGG GGC CCA GCA TCC CAG CAG ATC CTG TTC CAG GCC 674
Asn Arg Phe Lys Trp Gly Pro Ala Ser Gln Gln Ile Leu Phe Gln Ala
205 210 215

TAT GAG AGG CAG AAG AAC CCT AGC AAG GAG GAG CGA GAG ACG CTA GTG 722 
Tyr Glu Arg Gln Lys Asn Pro Ser Lys Glu Glu Arg Glu Thr Leu Val
220 225 230

GAG GAG TGC AAT AGG GCG GAA TGC ATC CAG AGA GGG GTG TCC CCA TCA 770
Glu Glu Cys Asn Arg Ala Glu Cys Ile Gln Arg Gly Val Ser Pro Ser
235 240 245

CAG GCA CAG GGG CTG GGC TCC AAC CTC GTC ACG GAG GTG CGT GTC TAC 818
Gln Ala Gln Gly Leu Gly Ser Asn Leu Val Thr Glu Val Arg Val Tyr
250 255 260 265

AAC TGG TTT GCC AAC CGG CGC AAA GAA GAA GCC TTC CGG CAC AAG CTG 866
Asn Trp Phe Ala Asn Arg Arg Lys Glu Glu Ala Phe Arg His Lys Leu
270 275 280

GCC ATG GAC ACG TAC AGC GGG CCC CCC CCA GGG CCA GGC CCG GGA CCT 914
Ala Met Asp Thr Tyr Ser Gly Pro Pro Pro Gly Pro Gly Pro Gly Pro
285 290 295

GCG CTG CCC GCT CAC AGC TCC CCT GGC CTG CCT CCA CCT GCC CTC TCC 962
Ala Leu Pro Ala His Ser Ser Pro Gly Leu Pro Pro Pro Ala Leu Ser
300 305 310

CCC AGT AAG GTC CAC GGT GTG CGC TNT GGA CAG CCT GCG ACC AGT GAG 1010
Pro Ser Lys Val His Gly Val Arg Gly Gln Pro Ala Thr Ser Glu
315 320 325




ACT GCA GAA GTA CCC TCA AGC AGC GGC GGT CCC TTA GTG ACA GTG TCT 1058 
Thr Ala Glu Val Pro Ser Ser Ser Gly Gly Pro Leu Val Thr Val Ser 
330 335 340 

ACA CCC CTC CAC CAA GTG TCC CCC ACG GGC CTG GAG CCC AGC CAC AGC 1106
Thr Pro Leu His Gln Val Ser Pro Thr Gly Leu Glu Pro Ser His Ser
345 350 355 360

CTG CTG AGT ACA GAA GCC AAG CTG GTC TCA GCA GCT GGG GGC CCC CTC 1154
Leu Leu Ser Thr Glu Ala Lys Leu Val Ser Ala Ala Gly Gly Pro Leu
365 370 375

CCC CCT GTC AGC ACC CTG ACA GCA CTG CAC AGC TTG GAG CAG ACA TCC 1202
Pro Pro Val Ser Thr Leu Thr Ala Leu His Ser Leu Glu Gln Thr Ser
380 385 390

CCA GGC CTC AAC CAG CAG CCC CAG AAC CTC ATC ATG GCC TCA CTT CCT 1250
Pro Gly Leu Asn Gln Gln Pro Gln Asn Leu Ile Met Ala Ser Leu Pro
395 400 405

GGG GTC ATG ACC ATC GGG CCT GGT GAG CCT GCC TCC CTG GGT CCT ACG 1298
Gly Val Met Thr Ile Gly Pro Gly Glu Pro Ala Ser Leu Gly Pro Thr
410 415 420

TTC ACC AAC ACA GGT GCC TCC ACC CTG GTC ATC GGC CTG GCC TCC ACG 1346
Phe Thr Asn Thr Gly Ala Ser Thr Leu Val Ile Gly Leu Ala Ser Thr
425 430 435 440

CAG GCA CAG AGT GTG CCG GTC ATC AAC AGC ATG GGC AGC AGC CTG ACC 1394
Gln Ala Gln Ser Val Pro Val Ile Asn Ser Met Gly Ser Ser Leu Thr
445 450 455

ACC CTG CAG CCC GTC CAG TTC TCC CAG CCG CTG CAC CCC TCC TAC CAG 1442 
Thr Leu Gln Pro Val Gln Phe Ser Gln Pro Leu His Pro Ser Tyr Gln
460 465 470

CAG CCG CTC ATG CCA CCT GTG CAG AGC CAT GTG ACC CAG AGC CCC TTC 1490
Gln Pro Leu Met Pro Pro Val Gln Ser His Val Thr Gln Ser Pro Phe
475 480 485 



ATG GCC ACC ATG GCT CAG CTG CAG AGC CCC CAC GCC CTC TAC AGC CAC 1538
Met Ala Thr Met Ala Gln Leu Gln Ser Pro His Ala Leu Tyr Ser His
490 495 500

AAG CCC GAG GTG GCC CAG TAC ACC CAC ACG GGC CTG CTC CCG CAG ACT 1586
Lys Pro Glu Val Ala Gln Tyr Thr His Thr Gly Leu Leu Pro Gln Thr
505 510 515 520

ATG CTC ATC ACC GAC ACC ACC AAC CTG AGC GCC CTG GCC AGC CTC ACG 1634
Met Leu Ile Thr Asp Thr Thr Asn Leu Ser Ala Leu Ala Ser Leu Thr
525 530 535

CCC ACC AAG CAG GTC TTC ACC TCA GAC ACT GAG GCC TCC AGT GAG TCC 1682
Pro Thr Lys Gln Val Phe Thr Ser Asp Thr Glu Ala Ser Ser Glu Ser
540 545 550



GGG CTT CAC ACG CCG GCA TCT CAG GCC ACC ACC CTC CAC GTC CCC AGC 1730 
Gly Leu His Thr Pro Ala Ser Gln Ala Thr Thr Leu His Val Pro Ser
555 560 565 



CAG GAC CCT GCC GGC ATC CAG CAC CTG CAG CCG GCC CAC CGG CTC AGC 1778
Gln Asp Pro Ala Gly Ile Gln His Leu Gln Pro Ala His Arg Leu Ser
570 575 580

GCC AGC CCC ACA GTG TCC TCC AGC AGC CTG GTG CTG TAC CAG AGC TCA 1826
Ala Ser Pro Thr Val Ser Ser Ser Ser Leu Val Leu Tyr Gln Ser Ser
585 590 595 600

GAC TCC AGC AAT GGC CAG AGC CAC CTG CTG CCA TCC AAC CAC AGC GTC 1874
Asp Ser Ser Asn Gly Gln Ser His Leu Leu Pro Ser Asn His Ser Val
605 610 615

ATC GAG ACC TTC ATC TCC ACC CAG ATG GCC TCT TCC TCC CAG 1916
Ile Glu Thr Phe Ile Ser Thr Gln Met Ala Ser Ser Ser Gln
620 625 630 



TAACCACGGC ACCTGGGCCC TGGGGCCTGT ACTGCCTGCT TGGGGGGTGA TGAGGGCAGC 1976 
AGCCAGCCCT GCCTGGAGGA CCTGAGCCTG CCGAGCAACC GTGGCCCTTC CTGGACAGCT 2036

GTGCCTCGCT CCCCACTCTG CTCTGATGCA TCAGAAAGGG AGGGCTCTGA GGCGCCCCAA 2096

CCCGTGGAGG CTGCTCGGGG TGCACAGGAG GGGGTCGTGG AGAGCTAGGA GCAAAGCCTG 2156

TTCATGGCAG ATGTAGGAGG GACTGTCGCT GCTTCGTGGG ATACAGTCTT CTTACTTGGA 2216

ACTGAAGGGG GCGGCCTATG ACTTGGGCAC CCCCAGCCTG GGCCTATGGA GAGCCCTGGG 2276

ACCGCTACAC CACTCTGGCA GCCACACTTC TCAGGACACA GGCCTGTGTA GCTGTGACCT 2336

GCTGAGCTCT GAGAGGCCCT GGATCAGCGT GGCCTTGTTC TGTCACCAAT GTACCCACCG 2396

GGCCACTCCT TCCTGCCCCA ACTCCTTCCA GCTAGTGACC CACATGCCAT TTGTACTGAC 2456

CCCATCACCT ACTCACACAG GCATTTCCTG GGTGGCTACT CTGTGCCAGA GCCTGGGGCT 2516

CTAACTGCCT GAGCCCAGGG AGGCCGAAGC TAACAGGGAA GGCAGGCAGG GCTCTCCTGG 2576

TCTTCCCATC CCCAGCGATT CCCTCTCCCA GGCCCCATGA CCTCCAGCTT TCCTGTATTT 2636

CTTCCCAAGA GCATGATGCC TCTGAGGCCA GCCTGGCCTC CTGCCTCTAC TGGGAAGGCT 2696

ACTTCGGGGC TGGGAAGTCG TCCTTACTCC TGTGGGAGCC TCGCAACCCG TGCCAAGTCC 2756

AGGTCCTGGT GGGGCAGCTC CTCTGTCTCG AGCGCCCTGC AGACCCTGCC CTTGTTTGGG 2816

GCAGGAGTAG CTGAGCTCAC AAGGCAGCAA GGCCCGAGCA GCTGAGCAGG GCCGGGGAAC 2876

TGGCCAAGCT GAGGTGCCCA GGAGAAGAAA GAGGTGACCC CAGGGCACAG GAGCTACCTG 2936

TGTGGACAGG ACTAACACTC AGAAGCCTGG GTGCCTGGCT GGCTGAGGGC AGTTCGCAGC 2996

CACCCTGAGG AGTCTGAGGT CCTGAGCACT GCCAGGAGGG ACAAAGGAGC CTGTGAACCC 3056

AGGACAAGCA TGGTCCCACA TCCCTGGGCC TGCTGCTGAG AACCTGGCCT TCAGTGTACC 3116

GCGTCTACCC TGGGATTCAG GAAAAGGCCT GGGGTGACCC GGCACCCCCT GCAGCTTGTA 3176

GCCAGCCGGG GCGAGTGGCA CGTTTATTTA ACTTTTAGTA AAGTCAAGGA GAAATGCGGT 3236

GG 3238

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 630 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu
1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu
20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Glu Gly Pro Leu Asp Lys Gly Glu 
35 40 45 

Ser Cys Gly Gly Gly Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu 
50 55 60 

Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp 
65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu
85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro
100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile
115 120 125

Pro Gln Gln Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu
130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala
145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln
165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp
180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro
195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro
210 215 220

Ser Lys Glu Glu Arg Glu Thr Leu Val Glu Glu Cys Asn Arg Ala Glu
225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser
245 250 255 

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg 
260 265 270 

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly 
275 280 285 

Pro Pro Pro Gly Pro Gly Pro Gly Pro Ala Leu Pro Ala His Ser Ser 
290 295 300 

Pro Gly Leu Pro Pro Pro Ala Leu Ser Pro Ser Lys Val His Gly Val 
305 310 315 320 

Arg Gly Gln Pro Ala Thr Ser Glu Thr Ala Glu Val Pro Ser Ser Ser
325 330 335

Gly Gly Pro Leu Val Thr Val Ser Thr Pro Leu His Gln Val Ser Pro
340 345 350

Thr Gly Leu Glu Pro Ser His Ser Leu Leu Ser Thr Glu Ala Lys Leu
355 360 365

Val Ser Ala Ala Gly Gly Pro Leu Pro Pro Val Ser Thr Leu Thr Ala
370 375 380

Leu His Ser Leu Glu Gln Thr Ser Pro Gly Leu Asn Gln Gln Pro Gln
385 390 395 400

Asn Leu Ile Met Ala Ser Leu Pro Gly Val Met Thr Ile Gly Pro Gly
405 410 415

Glu Pro Ala Ser Leu Gly Pro Thr Phe Thr Asn Thr Gly Ala Ser Thr
420 425 430

Leu Val Ile Gly Leu Ala Ser Thr Gln Ala Gln Ser Val Pro Val Ile
435 440 445

Asn Ser Met Gly Ser Ser Leu Thr Thr Leu Gln Pro Val Gln Phe Ser
450 455 460

Gln Pro Leu His Pro Ser Tyr Gln Gln Pro Leu Met Pro Pro Val Gln
465 470 475 480

Ser His Val Thr Gln Ser Pro Phe Met Ala Thr Met Ala Gln Leu Gln
485 490 495

Ser Pro His Ala Leu Tyr Ser His Lys Pro Glu Val Ala Gln Tyr Thr
500 505 510

His Thr Gly Leu Leu Pro Gln Thr Met Leu Ile Thr Asp Thr Thr Asn
515 520 525

Leu Ser Ala Leu Ala Ser Leu Thr Pro Thr Lys Gln Val Phe Thr Ser
530 535 540

Asp Thr Glu Ala Ser Ser Glu Ser Gly Leu His Thr Pro Ala Ser Gln
545 550 555 560

Ala Thr Thr Leu His Val Pro Ser Gln Asp Pro Ala Gly Ile Gln His
565 570 575

Leu Gln Pro Ala His Arg Leu Ser Ala Ser Pro Thr Val Ser Ser Ser
580 585 590

Ser Leu Val Leu Tyr Gln Ser Ser Asp Ser Ser Asn Gly Gln Ser His
595 600 605

Leu Leu Pro Ser Asn His Ser Val Ile Glu Thr Phe Ile Ser Thr Gln
610 615 620

Met Ala Ser Ser Ser Gln
625 630 
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3239 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 989

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A, C, G, or T"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 24..965

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CGTGGCCCTG TGGCAGCCGA GCC ATG GTT TCT AAA CTG AGC CAG CTG CAG 

Met Val Ser Lys Leu Ser Gln Leu Gln
1 5 

ACG GAG CTC CTG GCG GCC CTG CTC GAG TCA GGG CTG AGC AAA GAG GCA 
Thr Glu Leu Leu Ala Ala Leu Leu Glu Ser Gly Leu Ser Lys Glu Ala 
10 15 20 25 

CTG ATC CAG GCA CTG GGT GAG CCG GGG CCC TAC CTC CTG GCT GGA GAA 
Leu Ile Gln Ala Leu Gly Glu Pro Gly Pro Tyr Leu Leu Ala Gly Glu
30 35 40 

GGC CCC CTG GAC AAG GGG GAG TCC TGC GGC GGC GGT CGA GGG GAG CTG 
Gly Pro Leu Asp Lys Gly Glu Ser Cys Gly Gly Gly Arg Gly Glu Leu 
45 50 55 

GCT GAG CTG CCC AAT GGG CTG GGG GAG ACT CGG GGC TCC GAG GAC GAG 
Ala Glu Leu Pro Asn Gly Leu Gly Glu Thr Arg Gly Ser Glu Asp Glu 
60 65 70 

ACG GAC GAC GAT GGG GAA GAC TTC ACG CCA CCC ATC CTC AAA GAG CTG 
Thr Asp Asp Asp Gly Glu Asp Phe Thr Pro Pro Ile Leu Lys Glu Leu
75 80 85 

GAG AAC CTC AGC CCT GAG GAG GCG GCC CAC CAG AAA GCC GTG GTG GAG 
Glu Asn Leu Ser Pro Glu Glu Ala Ala His Gln Lys Ala Val Val Glu
90 95 100 105 

ACC CTT CTG CAG GAG GAC CCG TGG CGT GTG GCG AAG ATG GTC AAG TCC 
Thr Leu Leu Gln Glu Asp Pro Trp Arg Val Ala Lys Met Val Lys Ser
110 115 120 

TAC CTG CAG CAG CAC AAC ATC CCA CAG CGG GAG GTG GTC GAT ACC ACT 
Tyr Leu Gln Gln His Asn Ile Pro Gln Arg Glu Val Val Asp Thr Thr
125 130 135 

GGC CTC AAC CAG TCC CAC CTG TCC CAA CAC CTC AAC AAG GGC ACT CCC 
Gly Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro
140 145 150

ATG AAG ACG CAG AAG CGG GCC GCC CTG TAC ACC TGG TAC GTC CGC AAG 530
Met Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys
155 160 165

CAG CGA GAG GTG GCG CAG CAG TTC ACC CAT GCA GGG CAG GGA GGG CTG 578
Gln Arg Glu Val Ala Gln Gln Phe Thr His Ala Gly Gln Gly Gly Leu
170 175 180 185

ATT GAA GAG CCC ACA GGT GAT GAG CTA CCA ACC AAG AAG GGG CGG AGG 626
Ile Glu Glu Pro Thr Gly Asp Glu Leu Pro Thr Lys Lys Gly Arg Arg
190 195 200 

AAC CGT TTC AAG TGG GGC CCA GCA TCC CAG CAG ATC CTG TTC CAG GCC 674 
Asn Arg Phe Lys Trp Gly Pro Ala Ser Gln Gln Ile Leu Phe Gln Ala
205 210 215

TAT GAG AGG CAG AAG AAC CCT AGC AAG GAG GAG CGA GAG ACG CTA GTG 722
Tyr Glu Arg Gln Lys Asn Pro Ser Lys Glu Glu Arg Glu Thr Leu Val
220 225 230

GAG GAG TGC AAT AGG GCG GAA TGC ATC CAG AGA GGG GTG TCC CCA TCA 770
Glu Glu Cys Asn Arg Ala Glu Cys Ile Gln Arg Gly Val Ser Pro Ser
235 240 245

CAG GCA CAG GGG CTG GGC TCC AAC CTC GTC ACG GAG GTG CGT GTC TAC 818
Gln Ala Gln Gly Leu Gly Ser Asn Leu Val Thr Glu Val Arg Val Tyr
250 255 260 265

AAC TGG TTT GCC AAC CGG CGC AAA GAA GAA GCC TTC CGG CAC AAG CTG 866
Asn Trp Phe Ala Asn Arg Arg Lys Glu Glu Ala Phe Arg His Lys Leu
270 275 280

GCC ATG GAC ACG TAC AGC GGG CCC CCC CCC AGG GCC AGG CCC GGG ACC 914
Ala Met Asp Thr Tyr Ser Gly Pro Pro Pro Arg Ala Arg Pro Gly Thr
285 290 295

TGC GCT GCC CGC TCA CAG CTC CCC TGG CCT GCC TCC ACC TGC CCT CTC 962
Cys Ala Ala Arg Ser Gln Leu Pro Trp Pro Ala Ser Thr Cys Pro Leu
300 305 310

CCC CAGTAAGGTC CACGGTGTGC GCTNTGGACA GCCTGCGACC AGTGAGACTG 1015
Pro 

CAGAAGTACC CTCAAGCAGC GGCGGTCCCT TAGTGACAGT GTCTACACCC CTCCACCAAG 1075

TGTCCCCCAC GGGCCTGGAG CCCAGCCACA GCCTGCTGAG TACAGAAGCC AAGCTGGTCT 1135

CAGCAGCTGG GGGCCCCCTC CCCCCTGTCA GCACCCTGAC AGCACTGCAC AGCTTGGAGC 1195

AGACATCCCC AGGCCTCAAC CAGCAGCCCC AGAACCTCAT CATGGCCTCA CTTCCTGGGG 1255 

TCATGACCAT CGGGCCTGGT GAGCCTGCCT CCCTGGGTCC TACGTTCACC AACACAGGTG 1315 

CCTCCACCCT GGTCATCGGC CTGGCCTCCA CGCAGGCACA GAGTGTGCCG GTCATCAACA 1375 

GCATGGGCAG CAGCCTGACC ACCCTGCAGC CCGTCCAGTT CTCCCAGCCG CTGCACCCCT 1435 

CCTACCAGCA GCCGCTCATG CCACCTGTGC AGAGCCATGT GACCCAGAGC CCCTTCATGG 1495



CCACCATGGC TCAGCTGCAG AGCCCCCACG CCCTCTACAG CCACAAGCCC GAGGTGGCCC 1555

AGTACACCCA CACGGGCCTG CTCCCGCAGA CTATGCTCAT CACCGACACC ACCAACCTGA 1615

GCGCCCTGGC CAGCCTCACG CCCACCAAGC AGGTCTTCAC CTCAGACACT GAGGCCTCCA 1675

GTGAGTCCGG GCTTCACACG CCGGCATCTC AGGCCACCAC CCTCCACGTC CCCAGCCAGG 1735

ACCCTGCCGG CATCCAGCAC CTGCAGCCGG CCCACCGGCT CAGCGCCAGC CCCACAGTGT 1795

CCTCCAGCAG CCTGGTGCTG TACCAGAGCT CAGACTCCAG CAATGGCCAG AGCCACCTGC 1855

TGCCATCCAA CCACAGCGTC ATCGAGACCT TCATCTCCAC CCAGATGGCC TCTTCCTCCC 1915

AGTAACCACG GCACCTGGGC CCTGGGGCCT GTACTGCCTG CTTGGGGGGT GATGAGGGCA 1975

GCAGCCAGCC CTGCCTGGAG GACCTGAGCC TGCCGAGCAA CCGTGGCCCT TCCTGGACAG 2035

CTGTGCCTCG CTCCCCACTC TGCTCTGATG CATCAGAAAG GGAGGGCTCT GAGGCGCCCC 2095

AACCCGTGGA GGCTGCTCGG GGTGCACAGG AGGGGGTCGT GGAGAGCTAG GAGCAAAGCC 2155

TGTTCATGGC AGATGTAGGA GGGACTGTCG CTGCTTCGTG GGATACAGTC TTCTTACTTG 2215

GAACTGAAGG GGGCGGCCTA TGACTTGGGC ACCCCCAGCC TGGGCCTATG GAGAGCCCTG 2275

GGACCGCTAC ACCACTCTGG CAGCCACACT TCTCAGGACA CAGGCCTGTG TAGCTGTGAC 2335

CTGCTGAGCT CTGAGAGGCC CTGGATCAGC GTGGCCTTGT TCTGTCACCA ATGTACCCAC 2395

CGGGCCACTC CTTCCTGCCC CAACTCCTTC CAGCTAGTGA CCCACATGCC ATTTGTACTG 2455

ACCCCATCAC CTACTCACAC AGGCATTTCC TGGGTGGCTA CTCTGTGCCA GAGCCTGGGG 2515

CTCTAACTGC CTGAGCCCAG GGAGGCCGAA GCTAACAGGG AAGGCAGGCA GGGCTCTCCT 2575

GGTCTTCCCA TCCCCAGCGA TTCCCTCTCC CAGGCCCCAT GACCTCCAGC TTTCCTGTAT 2635

TTCTTCCCAA GAGCATGATG CCTCTGAGGC CAGCCTGGCC TCCTGCCTCT ACTGGGAAGG 2695

CTACTTCGGG GCTGGGAAGT CGTCCTTACT CCTGTGGGAG CCTCGCAACC CGTGCCAAGT 2755

CCAGGTCCTG GTGGGGCAGC TCCTCTGTCT CGAGCGCCCT GCAGACCCTG CCCTTGTTTG 2815

GGGCAGGAGT AGCTGAGCTC ACAAGGCAGC AAGGCCCGAG CAGCTGAGCA GGGCCGGGGA 2875

ACTGGCCAAG CTGAGGTGCC CAGGAGAAGA AAGAGGTGAC CCCAGGGCAC AGGAGCTACC 2935

TGTGTGGACA GGACTAACAC TCAGAAGCCT GGGTGCCTGG CTGGCTGAGG GCAGTTCGCA 2995

GCCACCCTGA GGAGTCTGAG GTCCTGAGCA CTGCCAGGAG GGACAAAGGA GCCTGTGAAC 3055

CCAGGACAAG CATGGTCCCA CATCCCTGGG CCTGCTGCTG AGAACCTGGC CTTCAGTGTA 3115

CCGCGTCTAC CCTGGGATTC AGGAAAAGGC CTGGGGTGAC CCGGCACCCC CTGCAGCTTG 3175

TAGCCAGCCG GGGCGAGTGG CACGTTTATT TAACTTTTAG TAAAGTCAAG GAGAAATGCG 3235

GTGA 3239
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 314 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu
1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu
20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Glu Gly Pro Leu Asp Lys Gly Glu 
35 40 45 

Ser Cys Gly Gly Gly Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu 
50 55 60 



Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp
65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu
85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro
100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile
115 120 125

Pro Gln Arg Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu
130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala
145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln
165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp
180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro
195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro
210 215 220

Ser Lys Glu Glu Arg Glu Thr Leu Val Glu Glu Cys Asn Arg Ala Glu
225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser
245 250 255

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg
260 265 270

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly
275 280 285

Pro Pro Pro Arg Ala Arg Pro Gly Thr Cys Ala Ala Arg Ser Gln Leu
290 295 300

Pro Trp Pro Ala Ser Thr Cys Pro Leu Pro
305 310



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3236 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 988

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A, C, G, or T"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(24..986, 990..1271)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGTGGCCCTG TGGCAGCCGA GCC ATG GTT TCT AAA CTG AGC CAG CTG CAG
Met Val Ser Lys Leu Ser Gln Leu Gln
1 5

ACG GAG CTC CTG GCG GCC CTG CTC GAG TCA GGG CTG AGC AAA GAG GCA
Thr Glu Leu Leu Ala Ala Leu Leu Glu Ser Gly Leu Ser Lys Glu Ala
10 15 20 25

CTG ATC CAG GCA CTG GGT GAG CCG GGG CCC TAC CTC CTG GCT GGA GAA
Leu Ile Gln Ala Leu Gly Glu Pro Gly Pro Tyr Leu Leu Ala Gly Glu
30 35 40

GGC CCC CTG GAC AAG GGG GAG TCC TGC GGC GGC GGT CGA GGG GAG CTG
Gly Pro Leu Asp Lys Gly Glu Ser Cys Gly Gly Gly Arg Gly Glu Leu
45 50 55

GCT GAG CTG CCC AAT GGG CTG GGG GAG ACT CGG GGC TCC GAG GAC GAG
Ala Glu Leu Pro Asn Gly Leu Gly Glu Thr Arg Gly Ser Glu Asp Glu
60 65 70

ACG GAC GAC GAT GGG GAA GAC TTC ACG CCA CCC ATC CTC AAA GAG CTG
Thr Asp Asp Asp Gly Glu Asp Phe Thr Pro Pro Ile Leu Lys Glu Leu
75 80 85

GAG AAC CTC AGC CCT GAG GAG GCG GCC CAC CAG AAA GCC GTG GTG GAG
Glu Asn Leu Ser Pro Glu Glu Ala Ala His Gln Lys Ala Val Val Glu
90 95 100 105

ACC CTT CTG CAG GAG GAC CCG TGG CGT GTG GCG AAG ATG GTC AAG TCC 386
Thr Leu Leu Gln Glu Asp Pro Trp Arg Val Ala Lys Met Val Lys Ser
110 115 120

TAC CTG CAG CAG CAC AAC ATC CCA CAG CGG GAG GTG GTC GAT ACC ACT 434
Tyr Leu Gln Gln His Asn Ile Pro Gln Arg Glu Val Val Asp Thr Thr
125 130 135

GGC CTC AAC CAG TCC CAC CTG TCC CAA CAC CTC AAC AAG GGC ACT CCC 482
Gly Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro
140 145 150

ATG AAG ACG CAG AAG CGG GCC GCC CTG TAC ACC TGG TAC GTC CGC AAG 530
Met Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys
155 160 165

CAG CGA GAG GTG GCG CAG CAG TTC ACC CAT GCA GGG CAG GGA GGG CTG 578
Gln Arg Glu Val Ala Gln Gln Phe Thr His Ala Gly Gln Gly Gly Leu
170 175 180 185

ATT GAA GAG CCC ACA GGT GAT GAG CTA CCA ACC AAG AAG GGG CGG AGG 626
Ile Glu Glu Pro Thr Gly Asp Glu Leu Pro Thr Lys Lys Gly Arg Arg
190 195 200

AAC CGT TTC AAG TGG GGC CCA GCA TCC CAG CAG ATC CTG TTC CAG GCC 674
Asn Arg Phe Lys Trp Gly Pro Ala Ser Gln Gln Ile Leu Phe Gln Ala
205 210 215

TAT GAG AGG CAG AAG AAC CCT AGC AAG GAG GAG CGA GAG ACG CTA GTG 722
Tyr Glu Arg Gln Lys Asn Pro Ser Lys Glu Glu Arg Glu Thr Leu Val
220 225 230

GAG GAG TGC AAT AGG GCG GAA TGC ATC CAG AGA GGG GTG TCC CCA TCA 770
Glu Glu Cys Asn Arg Ala Glu Cys Ile Gln Arg Gly Val Ser Pro Ser
235 240 245

CAG GCA CAG GGG CTG GGC TCC AAC CTC GTC ACG GAG GTG CGT GTC TAC 818
Gln Ala Gln Gly Leu Gly Ser Asn Leu Val Thr Glu Val Arg Val Tyr
250 255 260 265

AAC TGG TTT GCC AAC CGG CGC AAA GAA GAA GCC TTC CGG CAC AAG CTG 866
Asn Trp Phe Ala Asn Arg Arg Lys Glu Glu Ala Phe Arg His Lys Leu
270 275 280

GCC ATG GAC ACG TAC AGC GGG CCC CCC CCA GGG CCA GGC CCG GGA CCT 914
Ala Met Asp Thr Tyr Ser Gly Pro Pro Pro Gly Pro Gly Pro Gly Pro
285 290 295

GCG CTG CCC GCT CAC AGC TCC CCT GGC CTG CCT CCA CCT GCC CTC TCC 962
Ala Leu Pro Ala His Ser Ser Pro Gly Leu Pro Pro Pro Ala Leu Ser
300 305 310

CCC AGT AAG GTC CAC GGT GTG CGC TNT GGA CAG CCT GCG ACC AGT GAG 1010
Pro Ser Lys Val His Gly Val Arg Gly Gln Pro Ala Thr Ser Glu
315 320 325

ACT GCA GAA GTA CCC TCA AGC AGC GGC GGT CCC TTA GTG ACA GTG TCT 1058
Thr Ala Glu Val Pro Ser Ser Ser Gly Gly Pro Leu Val Thr Val Ser
330 335 340

ACA CCC CTC CAC CAA GTG TCC CCC ACG GGC CTG GAG CCC AGC CAC AGC 1106
Thr Pro Leu His Gln Val Ser Pro Thr Gly Leu Glu Pro Ser His Ser
345 350 355 360

CTG CTG AGT ACA GAA GCC AAG CTG GTC TCA GCA GCT GGG GGC CCC CTC 1154
Leu Leu Ser Thr Glu Ala Lys Leu Val Ser Ala Ala Gly Gly Pro Leu
365 370 375

CCC CGT CAG CAC CCT GAC AGC ACT GCA CAG CTT GGA GCA GAC ATC CCC 1202
Pro Arg Gln His Pro Asp Ser Thr Ala Gln Leu Gly Ala Asp Ile Pro
380 385 390

AGG CCT CAA CCA GCA GCC CCA GAA CCT CAT CAT GGC CTC ACT TCC TGG 1250
Arg Pro Gln Pro Ala Ala Pro Glu Pro His His Gly Leu Thr Ser Trp
395 400 405

GGT CAT GAC CAT CGG GCC TGG TGAGCCTGCC TCCCTGGGTC CTACGTTCAC 1301
Gly His Asp His Arg Ala Trp
410 415


CAACACAGGT GCCTCCACCC TGGTCATCGG CCTGGCCTCC ACGCAGGCAC AGAGTGTGCP 1361

GGTCATCAAC AGCATGGGCA GCAGCCTGAC CACCCTGCAG CCCGTCCAGT TCTCCCACICC 1421

GCTGCACCCC TCCTACCAGC AGCCGCTCAT GCCACCTGTG CAGAGCCATG TGACCCAQAd 1481

CCCCTTCATG GCCACCATGG CTCAGCTGCA GAGCCCCCAC GCCCTCTACA GCCACAAdCC 1541

CGAGGTGGCC CAGTACACCC ACACGGGCCT GCTCCCGCAG ACTATGCTCA TCACCGACAC 1601

CACCAACCTG AGCGCCCTGG CCAGCCTCAC GCCCACCAAG CAGGTCTTCA CCTCAGACAC 1661

TGAGGCCTCC AGTGAGTCCG GGCTTCACAC GCCGGCATCT CAGGCCACCA CCCTCPAPf^T 1721




CCCCAGCCAG 


GACCCTGCCG 








TCAGCGCCAG 


1781 




CCCCACAGTG TCCTCCAGCA GCCTGGTGCT GTACCAGAGC TCAGACTCCA GCAATGGCCA 1841

GAGCCACCTG CTGCCATCCA ACCACAGCGT CATCGAGACC TTCATCTCCA CCCAGATGGC 1901

CTCTTCCTCC CAGTAACCAC GGCACCTGGG CCCTGGGGCC TGTACTGCCT GCTTGGGGGG 1961

TGATGAGGGC AGCAGCCAGC CCTGCCTGGA GGACCTGAGC CTGCCGAGCA ACCGTGGCCC 2021

TTCCTGGACA GCTGTGCCTC GCTCCCCACT CTGCTCTGAT GCATCAGAAA GGGAGGGCTC 2081

TGAGGCGCCC CAACCCGTGG AGGCTGCTCG GGGTGCACAG GAGGGGGTCG TGGAGAGCTA 2141

GGAGCAAAGC CTGTTCATGG CAGATGTAGG AGGGACTGTC GCTGCTTCGT GGGATACAGT 2201

CTTCTTACTT GGAACTGAAG GGGGCGGCCT ATGACTTGGG CACCCCCAGC CTGGGCCTAT 2261

GGAGAGCCCT GGGACCGCTA CACCACTCTG GCAGCCACAC TTCTCAGGAC ACAGGCCTGT 2321

GTAGCTGTGA CCTGCTGAGC TCTGAGAGGC CCTGGATCAG CGTGGCCTTG TTCTGTCACC 2381

AATGTACCCA CCGGGCCACT CCTTCCTGCC CCAACTCCTT CCAGCTAGTG ACCCACATGC 2441

CATTTGTACT GACCCCATCA CCTACTCACA CAGGCATTTC CTGGGTGGCT ACTCTGTGCC 2501

AGAGCCTGGG GCTCTAACTG CCTGAGCCCA GGGAGGCCGA AGCTAACAGG GAAGGCAGGC 2561

AGGGCTCTCC TGGTCTTCCC ATCCCCAGCG ATTCCCTCTC CCAGGCCCCA TGACCTCCAG 2621

CTTTCCTGTA TTTCTTCCCA AGAGCATGAT GCCTCTGAGG CCAGCCTGGC CTCCTGCCTC 2681

TACTGGGAAG GCTACTTCGG GGCTGGGAAG TCGTCCTTAC TCCTGTGGGA GCCTCGCAAC 2741

CCGTGCCAAG TCCAGGTCCT GGTGGGGCAG CTCCTCTGTC TCGAGCGCCC TGCAGACCCT 2801

GCCCTTGTTT GGGGCAGGAG TAGCTGAGCT CACAAGGCAG CAAGGCCCGA GCAGCTGAGC 2861

AGGGCCGGGG AACTGGCCAA GCTGAGGTGC CCAGGAGAAG AAAGAGGTGA CCCCAGGGCA 2921

CAGGAGCTAC CTGTGTGGAC AGGACTAACA CTCAGAAGCC TGGGTGCCTG GCTGGCTGAG 2981

GGCAGTTCGC AGCCACCCTG AGGAGTCTGA GGTCCTGAGC ACTGCCAGGA GGGACAAAGG 3041

AGCCTGTGAA CCCAGGACAA GCATGGTCCC ACATCCCTGG GCCTGCTGCT GAGAACCTGG 3101

CCTTCAGTGT ACCGCGTCTA CCCTGGGATT CAGGAAAAGG CCTGGGGTGA CCCGGCACCC 3161

CCTGCAGCTT GTAGCCAGCC GGGGCGAGTG GCACGTTTAT TTAACTTTTA GTAAAGTCAA 3221

GGAGAAATGC GGTGG 3236



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu
1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu
20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Glu Gly Pro Leu Asp Lys Gly Glu
35 40 45

Ser Cys Gly Gly Gly Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu
50 55 60

Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp
65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu
85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro
100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile
115 120 125

Pro Gln Arg Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu
130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala
145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln
165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp
180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro
195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro
210 215 220

Ser Lys Glu Glu Arg Glu Thr Leu Val Glu Glu Cys Asn Arg Ala Glu
225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser
245 250 255

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg
260 265 270

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly
275 280 285

Pro Pro Pro Gly Pro Gly Pro Gly Pro Ala Leu Pro Ala His Ser Ser
290 295 300

Pro Gly Leu Pro Pro Pro Ala Leu Ser Pro Ser Lys Val His Gly Val
305 310 315 320

Arg Gly Gln Pro Ala Thr Ser Glu Thr Ala Glu Val Pro Ser Ser Ser
325 330 335

Gly Gly Pro Leu Val Thr Val Ser Thr Pro Leu His Gln Val Ser Pro
340 345 350

Thr Gly Leu Glu Pro Ser His Ser Leu Leu Ser Thr Glu Ala Lys Leu
355 360 365

Val Ser Ala Ala Gly Gly Pro Leu Pro Arg Gln His Pro Asp Ser Thr
370 375 380

Ala Gln Leu Gly Ala Asp Ile Pro Arg Pro Gln Pro Ala Ala Pro Glu
385 390 395 400

Pro His His Gly Leu Thr Ser Trp Gly His Asp His Arg Ala Trp
405 410 415

(2) INFORMATION FOR SEQ ID NO: 9:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 7

(D) OTHER INFORMATION: /mod_base= OTHER
/note= "N = A, C, G, or T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GTTAATNATT ACC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TACACCACTC TGGCAGCCAC ACT 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CGGTGGGTAC ATTGGTGACA GAAC 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGCAGGCAAA CGCAACCCAC G

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GAAGGGGGGC TCGTTAGGAG C 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CATGCACAGT CCCCACCCTC A 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CTTCCAGCCC CCACCTATGA G 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 
GGGCAAGGTC AGGGGAATGG A 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CAGCCCAGAC CAAACCAGCA C 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CAGAACCCTC CCCTTCATGC C 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GGTGACTGCT GTCAATGGGA C 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
GGCAGACAGG CAGATGGCCT A 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GCCTCCCTAG GGACTGCTCC A

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
TGGAGCAGTC CCTAGGGAGG C 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
GTTGCCCCAT GAGCCTCCCA C 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 
GGTCTTGGGC AGGGGTGGGA T 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
CTGCAATGCC TGCCAGGCAC C 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CCCCTGCATC CATTGACAGC C

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GAGGCCTGGG ACTAGGGCTG T 



(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 
CTCTGTCACA GGCCGAGGGA G 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 
CCTGTGACAG AGCCCCTCAC C 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CGGACAGCAA CAGAAGGGGT G

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CAGAGCCCCT CACCCCCACA T 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GTACCCCTAG GGACAGGCAG G 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
ACCCCCCAAG CAGGCAGTAC A 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 104..217

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GCAGAGAGGG CACTGGGAGG AGGCAGTGGG AGGGCGGAGG GCGGGGGCCT TCGGGGTGGG
CGCCCAGGGT AGGGCAGGTG GCCGCGGCGT GGAGGCAGGG AGA ATG CGA CTC TCC
Met Arg Leu Ser
1

AAA ACC CTC GTC GAC ATG GAC ATG GCC GAC TAC AGT GCT GCA CTG GAC 163
Lys Thr Leu Val Asp Met Asp Met Ala Asp Tyr Ser Ala Ala Leu Asp
5 10 15 20

CCA GCC TAC ACC ACC CTG GAA TTT GAG AAT GTG CAG GTG TTG ACG ATG 211
Pro Ala Tyr Thr Thr Leu Glu Phe Glu Asn Val Gln Val Leu Thr Met
25 30 35

GGC AAT GGTAGGTGGG GGCAGATGTG CCCAGGTGTG CCAGTGGGGG CAGGTGTGCC 267
Gly Asn

TGGGTCCAGG AGCAGATCTT TGGCACTCAA CTTTGGGGTG GGAGGAGAAT GATACAAAAT 327

GGTAGGTTGG TCCTACAGGC CAGCACAGGT GTTGCCAAGT GAAGCCCATG TGCCCAGGCA 387

CAGTGATCAC AGGCATTCTG GGTGAAGGGA GGCCTGCAAG GGCCAATTTC CAGCAAAAGT 447

CGATCCCGGC TATTCCTCCC AGGCCCTTCC AGTCCTCACT GCCTCACAGT GGCTCTGCTT 507

GGCGCTTGGC ACAGTGACAT GATGGTGAGC TCCCCCTTGG TGCCCAGCTC CAGCGATTCA 567

GCCCAGCACG GCCCCTTCGT GAACCCCTTG GGCCTAGGTT CAGAGAGACG GCAAGGGATG 627
TTGTATCCCT GGAGATGGTG GTTGGAGACA TAACCGCATT TCTC 671

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Met Arg Leu Ser Lys Thr Leu Val Asp Met Asp Met Ala Asp Tyr Ser
1 5 10 15

Ala Ala Leu Asp Pro Ala Tyr Thr Thr Leu Glu Phe Glu Asn Val Gln
20 25 30

Val Leu Thr Met Gly Asn
35

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 796 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(286..312, 316..375)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

TGGATGTTTG TACATGTGTG CTGTGTGTGC GGGTCATAGA GCACATGTGT TTGTGCATGC 

GGACCTGTTG GAGTGCCCTG TTCTTCCTGC ATCTTTATCC TGTATGGGCG TTTTGTCGTG 

TGCCCATATT TGTACCTGCT GTGTATATAT GCAGTTCCCT GTGCTGCGGG CGGGGGTCAG 

CGGTCTCTGG TGTGCACGAC TGCACAGACC CAAATGCAGG ACTCTGTTGT TGCCACTCAC 

CAAGTGAGAT TCATATCAGC AACATGTCCG TTTGTCTCTG AGCAG ATT TGT TGC 

Ile Cys Cys
1

CGC TGC GTC TCG CCA GAT TGA GGC ATC CCC TCC GAC ATC ACT GGA GCA
Arg Cys Val Ser Pro Asp Gly Ile Pro Ser Asp Ile Thr Gly Ala
5 10 15

TAT CTG GAG GGG TGG ACA GTT CTC CAC AGG GAG GTAGGGGAAA AGAGGAGGCC 
Tyr Leu Glu Gly Trp Thr Val Leu His Arg Glu 
20 25 

CGGAAACCCC TCCTGGAGGG AAGAGCCCCA TCGGTCCCAG GCCAGCCTCA GAGGAGAGGG 

GGCAGGCAGC TGGCTGAGGT CAGCCTGCCA CCCTGCTTCC TTCTGTGTCT TGGAGCCACT 

CAGCCAGTAT GAGGCTGCAG CTCCAGCTGA GGTCTGGAAT CTTGTGGTCA GCTCAGCTAG 

GGTGAGGAGG CAGCTGCTGG GCACTGCTTG TTGTCAGCTC AGCAGGTGCT CACCTGCCCC 

TGCCGTCCAG TCACGTGTGA CCTTGGGCAT GTCACCTCCC CTATCCTGGC TTCTGTATCT 

TCTACAAAAC AGGCTTCATT CCCCCAGGCC TGCTGGCTGG ACGGCTTTTA GGCCTGTCTG 

AGGACCACGC CAGGAGCGCA AGGCAAAAAC ACACCAGAGA T 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Ile Cys Cys Arg Cys Val Ser Pro Asp Gly Ile Pro Ser Asp Ile Thr
1 5 10 15

Gly Ala Tyr Leu Glu Gly Trp Thr Val Leu His Arg Glu
20 25



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 634 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 326..499

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

CCCCTTGCGA GTTAGGAGGC CGGCTCCCAC CCCAGAAGGT GGCCAGGTTT TCATGCCTTC 60

CTAGAGAAAG CTGGGGCTGG TGGCCTCCAC CACAGGGAGA CGCAGACCCT CAGAAACAAG 120

TCTGTGAAGT CACAACCAGC CCCAGTTTAC AGATGTGAAA CTGAAGCTCC AAAAAGTCAG 180

GAGGTCACTG AGTGGGGAGG TGATGGAGTG GAACAGCCCC CAGATCTGGC TGAGGCCGAA 240

GCCCTGGAGA GATCCCCGCA AGGCTCCCTT AGATGCCTGA CATTCTGTTC TTCCTGAAGC 300

CTCACTCCCT TCTCTCCTGG CGCAG ACA CGT CCC CAT CAG AAG GCA CCA ACC 352
Thr Arg Pro His Gln Lys Ala Pro Thr
1 5

TCA ACG CGC CCA ACA GCC TGG GTG TCA GCG CCC TGT GTG CCA TCT GCG 400
Ser Thr Arg Pro Thr Ala Trp Val Ser Ala Pro Cys Val Pro Ser Ala
10 15 20 25

GGG ACC GGG CCA CGG GCA AAC ACT ACG GTG CCT CGA GCT GTG ACG GCT 448
Gly Thr Gly Pro Arg Ala Asn Thr Thr Val Pro Arg Ala Val Thr Ala
30 35 40

GCA AGG GCT TCT TCC GGA GGA GCG TGC GGA AGA ACC ACA TGT ACT CCT 496
Ala Arg Ala Ser Ser Gly Gly Ala Cys Gly Arg Thr Thr Cys Thr Pro
45 50 55

GCA GGTGAGGAGC CTCAATTTCT TCAGCTGGGA AATGGGCACA CTTGGGCTCA 549
Ala

TGGCCCCAAG GTCTGTCTTC TCCCTGAGTG GGTAGGTCCC AGAGACAGCT GCCCTTCAGG 609
GCCTTCAAGG CTCTTCTGGT TTTGT

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 58 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Thr Arg Pro His Gln Lys Ala Pro Thr Ser Thr Arg Pro Thr Ala Trp
1 5 10 15

Val Ser Ala Pro Cys Val Pro Ser Ala Gly Thr Gly Pro Arg Ala Asn
20 25 30

Thr Thr Val Pro Arg Ala Val Thr Ala Ala Arg Ala Ser Ser Gly Gly
35 40 45

Ala Cys Gly Arg Thr Thr Cys Thr Pro Ala
50 55



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(171..173, 177..265)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
AGAGAGTTCA TAGCACCTTT CCAGCTCCTG GTGGGTTCAA GAGAGAACTC CCGGGATGAA
GAGATGAGAG CACTGAGGTT GGGGGGTCAA CTGGATAGCC AGGGCCCTAG TTCTGTCCTA
AGAGGAGGAA GTTGTGTCTT CTCCATCCAA CCATCCAAAG CCCTCCCCAG ATT
Ile
1

TAG CCG GCA GTG CGT GGT GGA CAA AGA CAA GAG GAA CCA GTG CCG CTA
Pro Ala Val Arg Gly Gly Gln Arg Gln Glu Glu Pro Val Pro Leu
5 10 15

CTG CAG GCT CAA GAA ATG CTT CCG GGC TGG CAT GAA GAA GGA
Leu Gln Ala Gln Glu Met Leu Pro Gly Trp His Glu Glu Gly
20 25 30

AGGTGAGCCT CGGCCCTCCC CGCCCCACCA CCACTGCCCC ACCTGCACCC ACAGCTCCCC 

GACAGTCATT TACAACTGTA GCCACACTTT ATGACTCAGT GGCAGGCCCC AGGGTGACTG 

GCTAATGGCT GAGAAGAGGG AGGGCCTGGA AATCTGACCA TAGGGAGCGG CTGGGCTTGG 

TCTTGAGAAA GATTC

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Ile Pro Ala Val Arg Gly Gly Gln Arg Gln Glu Glu Pro Val Pro Leu
1 5 10 15

Leu Gln Ala Gln Glu Met Leu Pro Gly Trp His Glu Glu Gly
20 25 30



(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 662 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 84..188

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TCCCACTCCT CATCAGTCAC AGACACCCCC ACCCCCTACT CCATCCCTGT TCTCCCTCCT 60

CACCTCTCTG TGCCTCCTCA CAG CCG TCC AGA ATG AGC GGG ACC GGA TCA 110
Pro Ser Arg Met Ser Gly Thr Gly Ser
1 5

GCA CTC GAA GGT CAA GCT ATG AGG ACA GCA GCC TGC CCT CCA TCA ATG 158
Ala Leu Glu Gly Gln Ala Met Arg Thr Ala Ala Cys Pro Pro Ser Met
10 15 20 25

CGC TCC TGC AGG CGG AGG TCC TGT CCC GAC AGGTACCGGG GTGATCCTGC 208
Arg Ser Cys Arg Arg Arg Ser Cys Pro Asp
30 35

CACCCACCCA GGGGATCCCC CACACTACAG AGGAGCTCAC CTCCTCCACC TCCATTCTCC 268

CCAGCCAGGC CCTGGAGCAG CTGACGGGAG GGGCCTCAGA TATTACAGAA GGGACACTGA 328

GTGCGGTTTC ACATGGCCCA GTTTGCAGCA AGGGCAGGAA TCGAACCTGG CGCCCTGGGG 388

CACTTTCTAA TTCATCCTAC TGCCTGCATC CCACAGGCCA AGCAGAGTCT TCACCTTCAC 448

TGAGGGCCTG CGATCAGCTC AGCTCCGAGA GAACAGAGCA GTGGCTCAGT GGAGAGAGGT 508

GGCAAAGTGG GGCCCAGCCC TTCCCTTGCT GAGTGACCTT GGGCAAGTCA CAGCACCTCT 568

CTGAGCCATG GTTGCCTCAT TGTCAGAAAA GGATGATGAT TTTTTGCCCT GCTTCTCCTC 628

TAAGGCTGAC AGACTCCTTG GGGCTCTAAA GCTG 662



(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Pro Ser Arg Met Ser Gly Thr Gly Ser Ala Leu Glu Gly Gln Ala Met
1 5 10 15

Arg Thr Ala Ala Cys Pro Pro Ser Met Arg Ser Cys Arg Arg Arg Ser
20 25 30

Cys Pro Asp
35

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 185..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

TTCTCCCTCA TCCCTGCCTC CTCCCTCCCT CCGTTTTTAC CCTGAGCTTC CTTCAGAGCT 60

GGAGGGCACC CACTATCCAG CCCCCTCCCC ACATCTGATT CCAGGGAGGG GGCTCTGTGC 120

AGGGGACAGA GAATGCGGGA GGGCCCGGAC ATCTCCAGCA TTTTCTTCCC TGTATCTCTC 180

GAAG ATC ACC TCC CCC GTC TCC GGG ATC AAC GGC GAC ATT CGG GCG AAG 229
Ile Thr Ser Pro Val Ser Gly Ile Asn Gly Asp Ile Arg Ala Lys
1 5 10 15

AAG ATT GCC AGC ATC GCA GAT GTG TGT GAG TCC ATG AAG GAG CAG CTG 277
Lys Ile Ala Ser Ile Ala Asp Val Cys Glu Ser Met Lys Glu Gln Leu
20 25 30

CTG GTT CTC GTT GAG TGG GCC AAG TAC ATC CCA GCT TTC TGC GAG CTC 325
Leu Val Leu Val Glu Trp Ala Lys Tyr Ile Pro Ala Phe Cys Glu Leu
35 40 45

CCC CTG GAC GAC CAG GTGAGGATGG GCGTGGATGG TGGGCAGTAG TGGGCAGTGG 380
Pro Leu Asp Asp Gln

GCGGGGCAGC CAGGGGGCTG CTGGCCCACC TGGGATATAG CCGTGGACTG GCTTGATTTT

ATTTTATTTA ACAAAATATG TAGTGCACAC ACGTGTCTGA AACTTTAAAT CACCTTACAA 

ATATTAACTC AGTTAGCTCC TCCAACAACT CTATGAGGTA GGTACTAAGG TACTATTATT 

ACTGCCATCT CATAGGTGAG AGATTGGGGC ACAGAGAGGT TAAGTAACCT GCTCAAGGTC 
ACATAGCTAC TATCCAGCAT AGCTGGG 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Ile Thr Ser Pro Val Ser Gly Ile Asn Gly Asp Ile Arg Ala Lys Lys
1 5 10 15

Ile Ala Ser Ile Ala Asp Val Cys Glu Ser Met Lys Glu Gln Leu Leu
20 25 30

Val Leu Val Glu Trp Ala Lys Tyr Ile Pro Ala Phe Cys Glu Leu Pro
35 40 45

Leu Asp Asp Gln
50



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 844 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 429..515



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
ATTTTTACAA AGCACCCTTC ATAATTCTCC ATAGCTGGTC CATGGGTGGG AATTTGGGAC 
CCACAGTTTT GGAACTTTTT GGGATCATAG ACCTTTTTGA GAATCTCAAA AAAGAAAAAA 
AAGCACACAG AATGTTGCTT ACAGTTTCAT CAGGCACACA GAAGAGGCCC AGCACGAAGC 
AGTTTCTTGC CCAAGGACAC AGCAGTTCAA GGACAGAGTC AGCGCGAGGT CTCTCAGCTC
TGAGCACATG TTCTTTCCCC TTCCAGGTTT CTAGTTTTAT GGGTAGTAGT TTTATGATGC

CCATTTCACA GTTCAGGCAG GTAGAGGCAG AGGGGAGCAT TAAGCTGACT TGCCCAGCGT

CACTGAGTTG GCTACGGGCA GCCTTCCCAA GGGTACAGAT GGCAAACACT GTTCCTTATC

TCTTTCAG GTG GCC CTG CTC AGA GCC CAT GCT GGC GAG CAC CTG CTG CTC
Val Ala Leu Leu Arg Ala His Ala Gly Glu His Leu Leu Leu
1 5 10

GGA GCC ACC AAG AGA TCC ATG GTG TTC AAG GAC GTG CTG CTC CTA 
Gly Ala Thr Lys Arg Ser Met Val Phe Lys Asp Val Leu Leu Leu 
15 20 25 

GGTGAGGCGG CTGCCTGCCC TGGCCAGGGC TCCAGGGAGG GTATGCCTAG CATGGCACTC 

ACCCAGGCAA GGAGATTCAC ATGGTGGCAT GCAAGGGTGA GGGAGACTAG TCAGGAGTGG 

CCCTGTCCTC AGGCTTGCAT TGGAGGGCTC CAGGACTCAG TTTTCAACTG GGTACCCCAC 

TCAGATGCAA GGAAATGTGG ATGCAAGTCA CCAAATTCCC AGCATTGAAG TCAGAGCACG 

ATCAGGGTTA TCCCTGGAAT TACCTGTGCA TCCTTTTTTC TTTTGACAGA GTCTTGCTCT 

GTCACTCAGG CTGGAGTGCA ATGATGTGA 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

Val Ala Leu Leu Arg Ala His Ala Gly Glu His Leu Leu Leu Gly Ala
1 5 10 15

Thr Lys Arg Ser Met Val Phe Lys Asp Val Leu Leu Leu 
20 25 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 937 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: join(485..529, 533..640)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GCAACACTAG TATTTTAATA TAACAATGCT ATGAGGGAGC TCGATTATTT ATCCTCATCT 60

TATAGATAAG AAAACTGAGG CACAGAGAGG TTAAGTAACT TATCCAACTA TAACCAGCTA 120

TCAGGGGCAG AGCCATTTAA GCAGGGCAGT GCAGTTCCAG AATCTGGTCC TTTAACCTTG 180

ATGCTTTGGT GCCTATCAGG TGACCTTTGA ATGTCATCGA TCTTGTGAGT CATGTTGGTA 240

AATGGAGCTT GGGTCATGTG AAAGAGGTCC TAGAAAGCCA AGTTCCAAGC TCAGCCGGAT 300

GACTCAAGGC AGCTTATCTT CTGAATCTGG GCCTCAGCTT CCTTACCTGT GAAATGGGAG 360

TCACCATCCC TGCAGGTCCT CCTCCCACAG GCACCAGCTA TCTTGCCAAC TTAAAAGCCA 420

AAACTAGAGG AGAGGGGTCA ACCCAAAGTG ACTTCCCATC CTCCCTCCCT CCCAACCCTT 480

CCAG GCA ATG ACT ACA TTG TCC CTC GGC ACT GCC CGG AGC TGG CGG AGA 529
Ala Met Thr Thr Leu Ser Leu Gly Thr Ala Arg Ser Trp Arg Arg
1 5 10 15

TGA GCC GGG TGT CCA TAC GCA TCC TTG ACG AGC TGG TGC TGC CCT TCC 577
Ala Gly Cys Pro Tyr Ala Ser Leu Thr Ser Trp Cys Cys Pro Ser
20 25 30

AGG AGC TGC AGA TCG ATG ACA ATG AGT ATG CCT ACC TCA AAG CCA TCA 625
Arg Ser Cys Arg Ser Met Thr Met Ser Met Pro Thr Ser Lys Pro Ser
35 40 45

TCT TCT TTG ACC CAG GTACAGTGCA CACCTCCTAA GCCATCCCTG ACTCTCTCTC 680
Ser Ser Leu Thr Gln
50

CAGAACGCTC TGCCAGACTT CTCCTATTGG GTTCTGTACA CTGAGTTCAC AGCCTCATCT 740

CATGTTAACG ACAGCCAGGA GAGGCCGTTT TCATTTAACA GATGAGGCAA GTCAAGATTT 800

GAAGAGACAA TATGGCCGGG CGCAGTGGCT CACACCTGTA ATCCCATCAC TTTGGGAGGC 860

TGAGGCGGGC GGATCACCTG AGGTCAGGGG TCAAGATGAG CCTGGCTAAC ATGGAGAAAC 920

CCCATCTCTA CTTAAAA 937

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Ala Met Thr Thr Leu Ser Leu Gly Thr Ala Arg Ser Trp Arg Arg Ala
1 5 10 15

Gly Cys Pro Tyr Ala Ser Leu Thr Ser Trp Cys Cys Pro Ser Arg Ser
20 25 30

Cys Arg Ser Met Thr Met Ser Met Pro Thr Ser Lys Pro Ser Ser Ser
35 40 45

Leu Thr Gln
50



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 978 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(376..387, 391..432, 436..534, 538..610)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 



GTGGCTCTGC CAACAACTGG CTGTGCGACC CAGGACAAGT CCTATCTTTG CACTGTGTCT

GGGTTTCCCC GTGTGTAAGA TGAGGCGGTT GCTAGGTGCT TATTGGATGC ATTCCTCAAG

TCCCGCCCTC CATCTCCTAT TCCCCTCTCT TCTGGTTTAG TGCTTTAGGA AATGTGGCAG

AAATCTTTTT CTGCCTGTGT CTAGGAAATC ATAATTCATG CTGGCGTACC CTGGTTGTTG

AGGTCCCTGA ATCCTTGTGC CCACACTGCT GAAGACTCCT TGTGTGACAC AAGTCAGGGG

ACATCTGGGT CTTGACTCCC CAGATGCTCC AGGTGGACCC TGCTGCCCTC CCTTGCCCAC

CCTCTTCCAT TGTAG ATG CCA AGG GGC TGA GCG ATC CAG GGA AGA TCA AGC
Met Pro Arg Gly Ala Ile Gln Gly Arg Ser Ser
1 5 10



GGC TGC GTT CCC AGG TGC AGG TGA GCT TGG AGG ACT ACA TCA ACG ACC 
Gly Cys Val Pro Arg Cys Arg Ala Trp Arg Thr Thr Ser Thr Thr 

15 20 25 

GCC AGT ATG ACT CGC GTG GCC GCT TTG GAG AGC TGC TGC TGC TGC TGC 
Ala Ser Met Thr Arg Val Ala Ala Leu Glu Ser Cys Cys Cys Cys Cys 
30 35 40 

CCA CCT TGC AGA GCA TCA CGT GGC AGA TGA TCG AGC AGA TCC AGT TCA 
Pro Pro Cys Arg Ala Ser Arg Gly Arg Ser Ser Arg Ser Ser Ser 

45 50 55 

TCA AGC TCT TCG GCA TGG CCA AGA TTG ACA ACC TGT TGG AGG AGA TGC 
Ser Ser Ser Ser Ala Trp Pro Arg Leu Thr Thr Cys Trp Arg Arg Cys 
60 65 70 

TGC TGG GAGGTCCGTG CCAAGCCCAG GAGGGGCGGG GTTGGATTGG GGACTCCCCA 
Cys Trp 
75 



GGAGACAGGC CTCACACAGT GAGCTCACCC CTCAGCTCCT TGGCTTCCCC ACTGTGCCGC 
TTTGGGCAAG TTGCTTAACC TGTCTGTGCC TCAGTTTCCT CACCAGAAAA ATGGGAACAA
GGCAATGGTC TATTTGTTCA GGCACCGAGA ACCTAGCACG TGCCAGTCAC TGTTCTAAGT 
GCTGGCAATT CAGCAAAGAA CAAGATCTTT GCCCTCGGGG AGGCTGTGTG TGTGTGATAT 
GTATGGATGC GTGGATATCT GTGTATATGC CCGTATGTGC GTGCATGTGT ATATAAAGCC 
TCACATTTTA TGATTTTGA 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Met Pro Arg Gly Ala Ile Gln Gly Arg Ser Ser Gly Cys Val Pro Arg
1 5 10 15

Cys Arg Ala Trp Arg Thr Thr Ser Thr Thr Ala Ser Met Thr Arg Val 
20 25 30 

Ala Ala Leu Glu Ser Cys Cys Cys Cys Cys Pro Pro Cys Arg Ala Ser 
35 40 45 

Arg Gly Arg Ser Ser Arg Ser Ser Ser Ser Ser Ser Ser Ala Trp Pro 
50 55 60 

Arg Leu Thr Thr Cys Trp Arg Arg Cys Cys Trp 
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 984 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(443..490, 494..595)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
GGGACACATA GATGCTATAA GTAGGTCAGT TGGCTGCAGC AGAGATGTGG GGGATGAGGC 
TGAAAGGTGA GGCGGGACCA AATGGTTGAA GGACTTGCAC TCCAAGGAGC TTTGAGAGCC 
ATTGATTACA TCCATTATGT TACTATGTGA CCAATACATT ACTCATTAGA ACATTTACGT 
GATCTCAGAG CTTCCTTATA TGCACCTTGT TCCTTTCAAC TCACTTTTGT TCTCTTGGTT

TTTTGGGGTC CTCTTAACAC CCTCATGAAG TCTATAGATG GGAATGGTAC ACCCTAGTTT 

ACTAACCCAG GAATAGGTAC CCAACAGGCA CTGCCAATAT TGGATGGGCT GGTTGATTGG 

CCACGCCTGA GGAAGATGGC GTCCCAAGGC CTGAGGTCTG CATCCCAGAC TCTCCATCCT 

GATCGACCTT CTCTACCTGC AG GGT CCC CCA GCG ATG CAC CCC ATG CCC ACC 

Gly Pro Pro Ala Met His Pro Met Pro Thr
1 5 10

ACC CCC TGC ACC CTC ACC TGA TGC AGG AAC ATA TGG GAA CCA ACG TCA 
Thr Pro Cys Thr Leu Thr Cys Arg Asn Ile Trp Glu Pro Thr Ser

15 20 25 

TCG TTG CCA ACA CAA TGC CCA CTC ACC TCA GCA ACG GAC AGA TGT GTG 
Ser Leu Pro Thr Gln Cys Pro Leu Thr Ser Ala Thr Asp Arg Cys Val
30 35 40 

AGT GGC CCC GAC CCA GGG GAC AGG CAG GTGGGCAAAC TCTGGGATTT 
Ser Gly Pro Asp Pro Gly Asp Arg Gln
45 50 

TACCTTGCAA AGGGTGAGGA TGGGGCTTAA GACAGGAGGC AGGAGAAAGT GGAGTCTAGA 

AGGTAGAACC AGGATGCAAC AGTTTTCTGG GTTCCAGGGT AGGGAATAAA GGGCAAGATT 

GTCCATTTGT TGAGGCTGTT TATTCAGTAA GGTGACTGAC AGCCTTTACT GAATGAAGCC 

ATTGTTGGGA TGAGGCAATC CACTGGATGA GGTAACCCAT TGGGTGAAGA TGTCTTGGGT 

GAGAATTCCA TTAGTTGACA TTGTCCATTA AGTAAAAGTG GTCATTGAAG TAAGGCTGCA 

CAGTTGGGTA AGGCTATCCA TTAGACATTA GATGAGACTA CCCATTGGGT CAGGATGTCT 

GCTGGGCTA 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Gly Pro Pro Ala Met His Pro Met Pro Thr Thr Pro Cys Thr Leu Thr
1 5 10 15

Cys Arg Asn Ile Trp Glu Pro Thr Ser Ser Leu Pro Thr Gln Cys Pro
20 25 30

Leu Thr Ser Ala Thr Asp Arg Cys Val Ser Gly Pro Asp Pro Gly Asp
35 40 45

Arg Gln
50

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1103 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(289..429, 433..477, 481..492, 496..603, 607..630,
634..750, 754..810, 814..843, 847..1023, 1027..1071, 1075..1103)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

TTTGGGAGAA GCAGTCCAAG TCTGCATATC AAATAAATGA TGGAGGAGAT GGGTGGTAGG 60

ACCTTCCAGA CCTCATAAAA CTTAGGCTTT ATGATCTGGG ACTCACAGAA GGTTGAGCAA 120

TAAAAGACCT TAGGGATTAT CTGGCTTAAT TAATTCTCTC ATTTTATAGA GGAAGAAATT 180

AAGTCAAGGT GGGGCAGGGT GGGAGGGGAG AACTTTCCCG GGGCTCTTCA TTTACTCCCA 240

CAAAGGCTGG AATTTTGAGC AGCCCCTGTC TGTCTGTTTG TCCTTCCA GCC ACC CCT 297
Ala Thr Pro
1

GAG ACC CCA CAG CCC TCA CCG CCA GGT GGC TCA GGG TCT GAG CCC TAT 345
Glu Thr Pro Gln Pro Ser Pro Pro Gly Gly Ser Gly Ser Glu Pro Tyr
5 10 15

AAG CTC CTG CCG GGA GCC GTC GCC ACA ATC GTC AAG CCC CTC TCT GCC 393
Lys Leu Leu Pro Gly Ala Val Ala Thr Ile Val Lys Pro Leu Ser Ala
20 25 30 35

ATC CCC CAG CCG ACC ATC ACC AAG CAG GAA GTT ATC TAG CAA GCC GCT 441
Ile Pro Gln Pro Thr Ile Thr Lys Gln Glu Val Ile Gln Ala Ala
40 45 50

GGG GCT TGG GGG CTC CAC TGG CTC CCC CCA GCC CCC TAA GAG AGC ACC 489
Gly Ala Trp Gly Leu His Trp Leu Pro Pro Ala Pro Glu Ser Thr
55 60 65

TGG TGA TCA CGT GGT CAC GGC AAA GGA AGA CGT GAT GCC AGG ACC AGT 537
Trp Ser Arg Gly His Gly Lys Gly Arg Arg Asp Ala Arg Thr Ser
70 75 80

CCC AGA GCA GGA ATG GGA AGG ATG AAG GGC CCG AGA ACA TGG CCT AAG 585
Pro Arg Ala Gly Met Gly Arg Met Lys Gly Pro Arg Thr Trp Pro Lys
85 90 95

GCA CAT CCC ACT GCA CCC TGA CGC CCT GCT CTG ATA ACA AGA CTT 630
Ala His Pro Thr Ala Pro Arg Pro Ala Leu Ile Thr Arg Leu
100 105 110

TGA CTT GGG GAG ACC CTC TAC TGC CTT GGA CAA CTT TCT CAT GTT GAA 678
Leu Gly Glu Thr Leu Tyr Cys Leu Gly Gln Leu Ser His Val Glu
115 120 125

GCC ACT GCC TTC ACC TTC ACC TTC ATC CAT GTC CAA CCC CCG ACT TCA 726
Ala Thr Ala Phe Thr Phe Thr Phe Ile His Val Gln Pro Pro Thr Ser
130 135 140

TCC CAA AGG ACA GCC GCC TGG AGA TGA CTT GAG CCT TAC TTA AAC CCA 774
Ser Gln Arg Thr Ala Ala Trp Arg Leu Glu Pro Tyr Leu Asn Pro
145 150 155

GCT CCC TTC TTC CCT AGC CTG GTG CTT CTC CTC TCC TAG CCC CGG TCA 822
Ala Pro Phe Phe Pro Ser Leu Val Leu Leu Leu Ser Pro Arg Ser
160 165 170

TGG TGT CCA GAC AGA GCC CTG TGA GGC TGG GTC CAA TTG TGG CAC TTG 870
Trp Cys Pro Asp Arg Ala Leu Gly Trp Val Gln Leu Trp His Leu
175 180 185

GGG CAC CTT GCT CCT CCT TCT GCT GCT GCC CCC ACC TCT GCT GCC TCC 918
Gly His Leu Ala Pro Pro Ser Ala Ala Ala Pro Thr Ser Ala Ala Ser
190 195 200

CTC TGC TGT CAC CTT GCT CAG CCA TCC CGT CTT CTC CAA CAC CAC CTC 966
Leu Cys Cys His Leu Ala Gln Pro Ser Arg Leu Leu Gln His His Leu
205 210 215

TAC AGA GGC CAA GGA GGC CTT GGA AAC GAT TCC CCC AGT CAT TCT GGG 1014
Tyr Arg Gly Gln Gly Gly Leu Gly Asn Asp Ser Pro Ser His Ser Gly
220 225 230

AAC ATG TTG TAA GCA CTG ACT GGG ACC AGG CAC CAG GCA GGG TCT AGA 1062
Asn Met Leu Ala Leu Thr Gly Thr Arg His Gln Ala Gly Ser Arg
235 240 245

AGG CTG TGG TGA GGG AAG ACG CCT TTC TCC TCC AAC CCA AC 1103
Arg Leu Trp Gly Lys Thr Pro Phe Ser Ser Asn Pro
250 255 260



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 261 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Ala Thr Pro Glu Thr Pro Gln Pro Ser Pro Pro Gly Gly Ser Gly Ser
1 5 10 15

Glu Pro Tyr Lys Leu Leu Pro Gly Ala Val Ala Thr Ile Val Lys Pro
20 25 30

Leu Ser Ala Ile Pro Gln Pro Thr Ile Thr Lys Gln Glu Val Ile Gln
35 40 45

Ala Ala Gly Ala Trp Gly Leu His Trp Leu Pro Pro Ala Pro Glu Ser
50 55 60

Thr Trp Ser Arg Gly His Gly Lys Gly Arg Arg Asp Ala Arg Thr Ser 
65 70 75 80 

Pro Arg Ala Gly Met Gly Arg Met Lys Gly Pro Arg Thr Trp Pro Lys 
85 90 95 

Ala His Pro Thr Ala Pro Arg Pro Ala Leu Ile Thr Arg Leu Leu Gly
100 105 110

Glu Thr Leu Tyr Cys Leu Gly Gln Leu Ser His Val Glu Ala Thr Ala
115 120 125

Phe Thr Phe Thr Phe Ile His Val Gln Pro Pro Thr Ser Ser Gln Arg
130 135 140

Thr Ala Ala Trp Arg Leu Glu Pro Tyr Leu Asn Pro Ala Pro Phe Phe
145 150 155 160

Pro Ser Leu Val Leu Leu Leu Ser Pro Arg Ser Trp Cys Pro Asp Arg
165 170 175

Ala Leu Gly Trp Val Gln Leu Trp His Leu Gly His Leu Ala Pro Pro
180 185 190

Ser Ala Ala Ala Pro Thr Ser Ala Ala Ser Leu Cys Cys His Leu Ala
195 200 205

Gln Pro Ser Arg Leu Leu Gln His His Leu Tyr Arg Gly Gln Gly Gly
210 215 220

Leu Gly Asn Asp Ser Pro Ser His Ser Gly Asn Met Leu Ala Leu Thr
225 230 235 240

Gly Thr Arg His Gln Ala Gly Ser Arg Arg Leu Trp Gly Lys Thr Pro
245 250 255 

Phe Ser Ser Asn Pro 
260 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
GGGCACTGGG AGGAGGCAGT 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 
GCCTGTAGGA CCAACCTACC 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 

TCTGGTGTGC ACGACTGCAC 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 
CTGGAGCTGC AGCCTCATAC 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 
AAGGCTCCCT TAGATGCCTG 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 
CCACTCAGGG AGAAGACAGA CCT 

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62
CCTAGTTCTG TCCTAAGAGG 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63 
GTCATAAAGT GTGGCTACAG 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64 
CCACCCCCTA CTCCATCCCT GT 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 
CCCTCCCGTC AGCTGCTCCA 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66
GTGCAGGGGA CAGAGAATGC 



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 
AATCAAGCCA GTCCACGGCT AT 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68 
GCCCAGCGTC ACTGAGTTGG CTA 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69 
TTGCCTGGGT GAGTGCCATG 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70 
GCACCAGCTA TCTTGCCAAC 



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71 
AGGAGAAGTC TGGCAGAGCG 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72 

CTCCTTGTGT GACACAAGTC 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73 
CTCACTGTGT GAGGCCTGTC 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74 
TGGTTGATTG GCCACGCCTG 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 
ATCCTGGTTC TACCTTCTAG 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
CATTTACTCC CACAAAGGCT 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 
GACCACGTGA TCACCAGGTG 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1441 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 20..1414



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 

CTCCAAAACC CTCGTCGAC ATG GAC ATG GCC GAC TAC AGT GCT GCA CTG GAC 

Met Asp Met Ala Asp Tyr Ser Ala Ala Leu Asp 
1 5 10

CCA GCC TAC ACC ACC CTG GAA TTT GAG AAT GTG CAG GTG TTG ACG ATG 
Pro Ala Tyr Thr Thr Leu Glu Phe Glu Asn Val Gln Val Leu Thr Met
15 20 25 

GGC AAT GAC ACG TCC CCA TCA GAA GGC ACC AAC CTC AAC GCG CCC AAC 
Gly Asn Asp Thr Ser Pro Ser Glu Gly Thr Asn Leu Asn Ala Pro Asn 
30 35 40 

AGC CTG GGT GTC AGC GCC CTG TGT GCC ATC TGC GGG GAC CGG GCC ACG 
Ser Leu Gly Val Ser Ala Leu Cys Ala Ile Cys Gly Asp Arg Ala Thr
45 50 55 

GGC AAA CAC TAC GGT GCC TCG AGC TGT GAC GGC TGC AAG GGC TTC TTC 
Gly Lys His Tyr Gly Ala Ser Ser Cys Asp Gly Cys Lys Gly Phe Phe 
60 65 70 75 
CGG AGG AGC GTG CGG AAG AAC CAC ATG TAC TCC TGC AGA TTT AGC CGG
Arg Arg Ser Val Arg Lys Asn His Met Tyr Ser Cys Arg Phe Ser Arg 
80 85 90 

CAG TGC GTG GTG GAC AAA GAC AAG AGG AAC CAG TGC CGC TAC TGC AGG 
Gln Cys Val Val Asp Lys Asp Lys Arg Asn Gln Cys Arg Tyr Cys Arg
95 100 105 

CTC AAG AAA TGC TTC CGG GCT GGC ATG AAG AAG GAA GCC GTC CAG AAT 
Leu Lys Lys Cys Phe Arg Ala Gly Met Lys Lys Glu Ala Val Gln Asn
110 115 120 

GAG CGG GAC CGG ATC AGC ACT CGA AGG TCA AGC TAT GAG GAC AGC AGC 
Glu Arg Asp Arg Ile Ser Thr Arg Arg Ser Ser Tyr Glu Asp Ser Ser
125 130 135 

CTG CCC TCC ATC AAT GCG CTC CTG CAG GCG GAG GTC CTG TCC CGA CAG 
Leu Pro Ser Ile Asn Ala Leu Leu Gln Ala Glu Val Leu Ser Arg Gln
140 145 150 155 

ATC ACC TCC CCC GTC TCC GGG ATC AAC GGC GAC ATT CGG GCG AAG AAG 
Ile Thr Ser Pro Val Ser Gly Ile Asn Gly Asp Ile Arg Ala Lys Lys
160 165 170 

ATT GCC AGC ATC GCA GAT GTG TGT GAG TCC ATG AAG GAG CAG CTG CTG 
Ile Ala Ser Ile Ala Asp Val Cys Glu Ser Met Lys Glu Gln Leu Leu
175 180 185 

GTT CTC GTT GAG TGG GCC AAG TAC ATC CCA GCT TTC TGC GAG CTC CCC 
Val Leu Val Glu Trp Ala Lys Tyr Ile Pro Ala Phe Cys Glu Leu Pro
190 195 200 

CTG GAC GAC CAG GTG GCC CTG CTC AGA GCC CAT GCT GGC GAG CAC CTG 
Leu Asp Asp Gln Val Ala Leu Leu Arg Ala His Ala Gly Glu His Leu
205 210 215 

CTG CTC GGA GCC ACC AAG AGA TCC ATG GTG TTC AAG GAC GTG CTG CTC 
Leu Leu Gly Ala Thr Lys Arg Ser Met Val Phe Lys Asp Val Leu Leu 
220 225 230 235 

CTA GGC AAT GAC TAC ATT GTC CCT CGG CAC TGC CCG GAG CTG GCG GAG 

Leu Gly Asn Asp Tyr Ile Val Pro Arg His Cys Pro Glu Leu Ala Glu
240 245 250 

ATG AGC CGG GTG TCC ATA CGC ATC CTT GAC GAG CTG GTG CTG CCC TTC 
Met Ser Arg Val Ser Ile Arg Ile Leu Asp Glu Leu Val Leu Pro Phe
255 260 265 

CAG GAG CTG CAG ATC GAT GAC AAT GAG TAT GCC TAC CTC AAA GCC ATC 
Gln Glu Leu Gln Ile Asp Asp Asn Glu Tyr Ala Tyr Leu Lys Ala Ile
270 275 280 

ATC TTC TTT GAC CCA GAT GCC AAG GGG CTG AGC GAT CCA GGG AAG ATC 
Ile Phe Phe Asp Pro Asp Ala Lys Gly Leu Ser Asp Pro Gly Lys Ile
285 290 295 

AAG CGG CTG CGT TCC CAG GTG CAG GTG AGC TTG GAG GAC TAC ATC AAC 
Lys Arg Leu Arg Ser Gln Val Gln Val Ser Leu Glu Asp Tyr Ile Asn
300 305 310 315 
GAC CGC CAG TAT GAC TCG CGT GGC CGC TTT GGA GAG CTG CTG CTG CTG
Asp Arg Gln Tyr Asp Ser Arg Gly Arg Phe Gly Glu Leu Leu Leu Leu
320 325 330 

CTG CCC ACC TTG CAG AGC ATC ACC TGG CAG ATG ATC GAG CAG ATC CAG 
Leu Pro Thr Leu Gln Ser Ile Thr Trp Gln Met Ile Glu Gln Ile Gln
335 340 345 

TTC ATC AAG CTC TTC GGC ATG GCC AAG ATT GAC AAC CTG TTG CAG GAG 
Phe Ile Lys Leu Phe Gly Met Ala Lys Ile Asp Asn Leu Leu Gln Glu
350 355 360 

ATG CTG CTG GGA GGG TCC CCC AGC GAT GCA CCC CAT GCC CAC CAC CCC 
Met Leu Leu Gly Gly Ser Pro Ser Asp Ala Pro His Ala His His Pro 
365 370 375 

CTG CAC CCT CAC CTG ATG CAG GAA CAT ATG GGA ACC AAC GTC ATC GTT 
Leu His Pro His Leu Met Gln Glu His Met Gly Thr Asn Val Ile Val
380 385 390 395 

GCC AAC ACA ATG CCC ACT CAC CTC AGC AAC GGA CAG ATG TGT GAG TGG 
Ala Asn Thr Met Pro Thr His Leu Ser Asn Gly Gln Met Cys Glu Trp
400 405 410 

CCC CGA CCC AGG GGA CAG GCA GCC ACC CCT GAG ACC CCA CAG CCC TCA 
Pro Arg Pro Arg Gly Gln Ala Ala Thr Pro Glu Thr Pro Gln Pro Ser
415 420 425 

CCG CCA GGT GCG TCA GGG TCT GAG CCC TAT AAG CTC CTG CCG GGA GCC 
Pro Pro Gly Ala Ser Gly Ser Glu Pro Tyr Lys Leu Leu Pro Gly Ala 
430 435 440 

GTC GCC ACA ATC GTC AAG CCC CTC TCT GCC ATC CCC CAG CCG ACC ATC 
Val Ala Thr Ile Val Lys Pro Leu Ser Ala Ile Pro Gln Pro Thr Ile
445 450 455 

ACC AAG CAG GAA GTT ATC TAGCAAGCCG CTGGGGCTTG GGGGCTC 
Thr Lys Gln Glu Val Ile
460 465 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: 

Met Asp Met Ala Asp Tyr Ser Ala Ala Leu Asp Pro Ala Tyr Thr Thr
1 5 10 15

Leu Glu Phe Glu Asn Val Gln Val Leu Thr Met Gly Asn Asp Thr Ser
20 25 30

Pro Ser Glu Gly Thr Asn Leu Asn Ala Pro Asn Ser Leu Gly Val Ser
35 40 45

Ala Leu Cys Ala Ile Cys Gly Asp Arg Ala Thr Gly Lys His Tyr Gly
50 55 60

Ala Ser Ser Cys Asp Gly Cys Lys Gly Phe Phe Arg Arg Ser Val Arg
65 70 75 80

Lys Asn His Met Tyr Ser Cys Arg Phe Ser Arg Gln Cys Val Val Asp
85 90 95

Lys Asp Lys Arg Asn Gln Cys Arg Tyr Cys Arg Leu Lys Lys Cys Phe
100 105 110

Arg Ala Gly Met Lys Lys Glu Ala Val Gln Asn Glu Arg Asp Arg Ile
115 120 125

Ser Thr Arg Arg Ser Ser Tyr Glu Asp Ser Ser Leu Pro Ser Ile Asn
130 135 140

Ala Leu Leu Gln Ala Glu Val Leu Ser Arg Gln Ile Thr Ser Pro Val
145 150 155 160

Ser Gly Ile Asn Gly Asp Ile Arg Ala Lys Lys Ile Ala Ser Ile Ala
165 170 175

Asp Val Cys Glu Ser Met Lys Glu Gln Leu Leu Val Leu Val Glu Trp
180 185 190

Ala Lys Tyr Ile Pro Ala Phe Cys Glu Leu Pro Leu Asp Asp Gln Val
195 200 205

Ala Leu Leu Arg Ala His Ala Gly Glu His Leu Leu Leu Gly Ala Thr
210 215 220

Lys Arg Ser Met Val Phe Lys Asp Val Leu Leu Leu Gly Asn Asp Tyr
225 230 235 240

Ile Val Pro Arg His Cys Pro Glu Leu Ala Glu Met Ser Arg Val Ser
245 250 255

Ile Arg Ile Leu Asp Glu Leu Val Leu Pro Phe Gln Glu Leu Gln Ile
260 265 270

Asp Asp Asn Glu Tyr Ala Tyr Leu Lys Ala Ile Ile Phe Phe Asp Pro
275 280 285

Asp Ala Lys Gly Leu Ser Asp Pro Gly Lys Ile Lys Arg Leu Arg Ser
290 295 300

Gln Val Gln Val Ser Leu Glu Asp Tyr Ile Asn Asp Arg Gln Tyr Asp
305 310 315 320

Ser Arg Gly Arg Phe Gly Glu Leu Leu Leu Leu Leu Pro Thr Leu Gln
325 330 335

Ser Ile Thr Trp Gln Met Ile Glu Gln Ile Gln Phe Ile Lys Leu Phe
340 345 350

Gly Met Ala Lys Ile Asp Asn Leu Leu Gln Glu Met Leu Leu Gly Gly
355 360 365

Ser Pro Ser Asp Ala Pro His Ala His His Pro Leu His Pro His Leu
370 375 380

Met Gln Glu His Met Gly Thr Asn Val Ile Val Ala Asn Thr Met Pro
385 390 395 400

Thr His Leu Ser Asn Gly Gln Met Cys Glu Trp Pro Arg Pro Arg Gly
405 410 415

Gln Ala Ala Thr Pro Glu Thr Pro Gln Pro Ser Pro Pro Gly Ala Ser
420 425 430

Gly Ser Glu Pro Tyr Lys Leu Leu Pro Gly Ala Val Ala Thr Ile Val
435 440 445

Lys Pro Leu Ser Ala Ile Pro Gln Pro Thr Ile Thr Lys Gln Glu Val
450 455 460

Ile
465 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2329 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
GGGGCCCTGA TTCACGGGCC GCTGGGGCAG GGTTGGGGGT TGGGGGTGCC CACAGGGTTG 
GCTAGTGGGG TTTTGGGGGG GCAGTGGGTG CAAGGAGTTT GGTTTGTGTC TGCCGGCCGG 
CAGGCAAACG CAACCACGCG GTGGGGGAGG CGGCTAGCGT GGTGGACGGC CCGCGTGGCC 
CTGTGGCAGC CGAGCCATGG TTTCTAAACT GAGCCAGCTG CAGACGGAGC TCCTGGCGGC 
CCTGCTCGAG TCAGGGCTGA GCAAAGAGGC ACTGATCCAG GCACTGGGTG AGCCGGGGCC 
CTACCTCCTG GCTGGAGAAG GCCCCCTGGA CAAGGGGGAG TCCTGCGGCG GCGGTCGAGG 
GGAGCTGGCT GAGCTGCCCA ATGGGCTGGG GGAGACTCGG GGCTCCGAGG ACGAGACGGA 
CGACGATGGG GAAGACTTCA CGCCACCCAT CCTCAAAGAG CTGGAGAACC TCAGCCCTGA 
GGAGGCGGCC CACCAGAAAG CCGTGGTGGA GACCCTTCTG CAGGAGGACC CGTGGCGTGT 
GGCGAAGATG GTCAAGTCCT ACCTGCAGCA GCACAACATC CCACAGCGGG AGGTGGTCGA 
TACCACTGGC CTCAACCAGT CCCACCTGTC CCAACACCTC AACAAGGGCA CTCCCATGAA 
GACGCAGAAG CGGGCCGCCC TGTACACCTG GTACGTCCGC AAGCAGCGAG AGGTGGCGCA 
GCAGTTCACC CATGCAGGGC AGGGAGGGCT GATTGAAGAG CCCACAGGTG ATGAGCTACC 
AACCAAGAAG GGGCGGAGGA ACCGTTTCAA GTGGGGCCCA GCATCCCAGC AGATCCTGTT 
CCAGGCCTAT GAGAGGCAGA AGAACCCTAG CAAGGAGGAG CGAGAGACGC TAGTGGAGGA 
GTGCAATAGG GCGGAATGCA TCCAGAGAGG GGTGTCCCCA TCACAGGCAC AGGGGCTGGG 960

CTCCAACCTC GTCACGGAGG TGCGTGTCTA CAACTGGTTT GCCAACCGGC GCAAAGAAGA 1020 

AGCCTTCCGG CACAAGCTGG CCATGGACAC GTACAGCGGG CCCCCCCCAG GGCCAGGCCC 1080 

GGGACCTGCG CTGCCCGCTC ACAGCTCCCC TGGCCTGCCT CCACCTGCCC TCTCCCCCAG 1140 

TAAGGTCCAC GGTGTGCGCT ATGGACAGCC TGCGACCAGT GAGACTGCAG AAGTACCCTC 1200 

AAGCAGCGGC GGTCCCTTAG TGACAGTGTC TACACCCCTC CACCAAGTGT CCCCCACGGG 1260 

CCTGGAGCCC AGCCACAGCC TGCTGAGTAC AGAAGCCAAG CTGGTCTCAG CAGCTGGGGG 1320 

CCCCCTCCCC CCTGTCAGCA CCCTGACAGC ACTGCACAGC TTGGAGCAGA CATCCCCAGG 1380

CCTCAACCAG CAGCCCCAGA ACCTCATCAT GGCCTCACTT CCTGGGGTCA TGACCATCGG 1440 

GCCTGGTGAG CCTGCCTCCC TGGGTCCTAC GTTCACCAAC ACAGGTGCCT CCACCCTGGT 1500 

CATCGGCCTG GCCTCCACGC AGGCACAGAG TGTGCCGGTC ATCAACAGCA TGGGCAGCAG 1560 

CCTGACCACC CTGCAGCCCG TCCAGTTCTC CCAGCCGCTG CACCCCTCCT ACCAGCAGCC 1620 

GCTCATGCCA CCTGTGCAGA GCCATGTGAC CCAGAGCCCC TTCATGGCCA CCATGGCTCA 1680

GCTGCAGAGC CCCCACGGTG AGCACCCTGT GCCCCACACA GCAGGAGATG ATGATAGAGG 1740 

TTGGCTGTCA ATGGATGCAG GGGAAAGGGG TGCCTGGCAG GCATTGCAGT CTGCATGTGT 1800 

CTCTGGGACA AGTGTTTTTC CGTGATTGAG GGTGTCTGCA GGCCAGTGTG TTCCCATGTG 1860

AATGCACGTA TCTGTGTGTG TGCACGACTG CTTGTGTGAG CAGATCCCTA GTCGTGTCTG 1920

GGTGTGTATC GGTTGTGCAT GCATTTGTGT GCATCCTGTG TTTCTCTGAA ACTCTTAGGG 1980

CCATATGAAT TTCTAAAATC TATTCAGATT TTAGAAAGGT AATCTGGGGC CAGGCGTGGT 2040

GGCTCATGCC TGTAATCCCA GCACTTTGGA AGGCCGAGGT GGGCAGATCA CTTGAGGTCA 2100 

GGAGTTCAAG ACCAGCCTGG CCAACACGGT GAAACCCCGT CTCTACTAAA AGTACAAAAA 2160 

TTAGCCAGGC GTGGAGCACG TGCCTGTAGT CCCAGCTACT TGGGAGGCTG AGGCAGAATC 2220 

GCTTGAACCT GGGAGGCGGA GGTTGCAGTG AGCTGAGATT TGGCCACTGC ACTGCACTCC 2280

AGCCTGGGCA ACAGAGTGAG TACTCTGCCA AAAAAAAAAA AAAAAAAAA 2329 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

CACCTGGTGA TCACGTGGTC 20

(2) INFORMATION FOR SEQ ID NO: 82:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82
GTAAGGCTCA AGTCATCTCC 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83 

Glu Gly Cys Lys Gly 
1 5 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84 

Glu Gly Cys Lys Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85 

Asp Gly Cys Lys Gly 
1 5 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..36



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

GAC ACG TAC AGC GGC CCC CCC CCA GGG CCA GGC CCG
Asp Thr Tyr Ser Gly Pro Pro Pro Gly Pro Gly Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Asp Thr Tyr Ser Gly Pro Pro Pro Gly Pro Gly Pro 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..36



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

GAC ACG TAC AGC GGC CCC CCC CCC AGG GCC AGG CCC 
Asp Thr Tyr Ser Gly Pro Pro Pro Arg Ala Arg Pro 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Asp Thr Tyr Ser Gly Pro Pro Pro Arg Ala Arg Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
CATGAACCCC GAAGAGTGGT G 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
GCCTCCAGAC ACCTGTTACT 



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
GGCGATCATG GCAAGTTAGA AG 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
TTGGTGAGAG TATGGAAGAC C 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94 
GGGGTTTGCT TGTGAAACTC C 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95 
TTGGTGGGAA ACGGGCTTGG 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96 
CTCCCACTAG TACCCTAACC 



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97 
GAGAGGGCAA AGGTCACTTC AG 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
AGTGAAGGCT ACAGACCCTA TC 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 
TTCCTGGGTC TGTGTACTTG C 



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100 

TGTGTTTTGG GCCAAGCACC A 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101 
AACCAGATAA GATCCGTGGC 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102 
AACCAGACTC ACAGCCTGAA CC 



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103 
TCACAGGGCA ATGGCTGAAC 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104 

TGCCGAGTCA TTGTTCCAGG 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105 
CCTCTTATCT TATCAGCTCC AG 



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106 
CTGCTCTTTG TGGTCCAAGT CC 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107 
GAGTTTGAAG GAGACCTACA G



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
ATCCACCTCT CCTTATCCCA G 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109 
ACTTCCGAGA AAGTTCAGAC C 



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110 
TTTGCCTGTG TATGCACCTT G 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111 
GCCGAGTCCA TGCTTGCCAC 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112
CTTTGCTGGT TGAGTTGGGC



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113 
TTCCATGACA GCTGCCCAGA G 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114 
TAAAGGTTGG AGCCCCTCTG 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115 
TTGTAAGGTG ACCCCATCAG 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116 
TTGGTGATGT CCAGAAGTCC 



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117 
CAGAATGTGT CAGAGTTCGC 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118 
CTCCCTCCTG TTCTTAAGTG 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119 
CTGGACTCCC AGTTCAGTCA 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120 

CAAGGATCCA GAAGATTGGC 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121 
CGTCCTCTGG GAAGATCTGC



(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122 

GCAACAGAGC AAGACTCCAT CTCA 



(2) INFORMATION FOR SEQ ID NO:123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123 
GAGTTTAATG GAAGAACTAA CC 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124 

CCTCATGGAG AAACATCCTA AGT 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125 
AGGGAGTGCA CGGCTGAGCT CCTG 



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6254 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1287..4273

(D) OTHER INFORMATION: /note= "N = A or G or C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

AGCCAGCACT GTTCTTGGCA CATGGTAATC TTAACATATT TTTTCCTACA GGGAGGCCTG 60 

GTGTCAGGCC GGGAGTGGGG TGGAAGGGTC CCAAAATGGA TGGAAGGGCC CCAAAATGGC 120 

CGTGAGCATC CTCTGCCCTT GAGAAGAGCT AGCCCAGCTG TCTAGAGCTC CCTGCTGCTG 180

CCGCTCTCGT AAGCAGCAAG CATTTTTGGC TCTCCTGTCT CAGCATGATG CCCCTACAAG 240

GTTCTTTCGG GGGTGGGACC CAACGCTGCT CTCCTGATGG CCTCCCTGGC TCCCAGCACC 300

TTCCATCCCA GCTGCTCAGG GCCCCTCACC TGCGCCTCCC CCACCCTCCC CTCTGCCCAC 360

TCCCATCGCA GGCCATAGCT CCCTGTCCCT CTCCGCTGCC ATGAGGCCTG CACTTTGCAG 420

GGCTGAAGTC CAAAGTTCAG TCCCTTCGCT AAGCACACGG ATAAATATGA ACCTTGGAGA 480

ATTTCCCCAG CTCCAATGTA AACAGAACAG GCAGGGGCCC TGATTCACGG GCCGCTGGGG 540

CCAGGGTTGG GGGTTGGGGG TGCCCACAGG GCTTGGCTAG TGGGGTTTTG GGGGGGCAGT 600

GGGTGCAAGG AGTTTGGTTT GTGTCTGCCG GCCGGCAGGC AAACGCAACC CACGCGGTGG 660 

GGGAGGCGGC TAGCGTGGTG GACCCGGGCC GCGTGGCCCT GTGGCAGCCG AGCCATGGTT 720 

TCTAAACTGA GCCAGCTGCA GACGGAGCTC CTGGCGGCCC TGCTCGAGTC AGGGCTGAGC 780 

AAAGAGGCAC TGATCCAGGC ACTGGGTGAG CCGGGGCCCT ACCTCCTGGC TGGAGAAGGC 840 

CCCCTGGACA AGGGGGAGTC CTGCGGCGGC GGTCGAGGGG AGCTGGCTGA GCTGCCCAAT 900 

GGGCTGGGGG AGACTCGGGG CTCCGAGGAC GAGACGGACG ACGATGGGGA AGACTTCACG 960 

CCACCCATCC TCAAAGAGCT GGAGAACCTC AGCCCTGAGG AGGCGGCCCA CCAGAAAGCC 1020 

GTGGTGGAGA CCCTTCTGCA GTAAGGAGCC CTGCCCCGTC CCCGCTCCCA GGAGAGCCTA 1080 

GAGGGGCCCC CCTCAGCTCC TAACGAGCCC CCCTTCTGAG TTGAGTCCCC ATGACCTTCA 1140

GCCTTTAGCC TAGTTGCTGG GAAGGGGGAC AGGGCCCATG AGAGCCCAGG GGTCCTTGCT 1200

TGGAGGTTTG AGCCTCCAGC CCCTGAACTG CTCCTCTGCA GAGTCCCAAA TCCCATGAGC 1260

CCAGGCCTTT AGCCCAGTCC TTGGGCNAGG GGGACATTTC CCAGGGGGTC CAAGATGGGA 1320 

GAAAAAGCAG TGAATTCACA ACTCAAATGC CCACCCACCC ATCCATCCAT CCGTCCATCC 1380 

ACCCATTCAT CCATTCATCC ATTCACCCAT CCATCCATCC ACATATCTTC ATCTGTGTTG 1440

TGTGTCTGTG TATCCATGTT TCTAAACCTT TATCTGTTCC AGTGTCTGTA TCCATAGGCC 1500 

TGTGTCCACG TTTGTCATGT GTGTGCGTCN ACAAGTCTCT GTCCTCATGA CCATGTGTCT 1560 

GTGTCCCTGT GTCCTGGCAT AAATGACCAT ACCTCACCGT CCCTGAGTCT ATGTGTAGGC 1620

CCCTGGGCTC CATAACTGCT TTCATGCACA GTCCCCACCC TCAGAGTTGA CAAGGTTCCA 1680

GCACCCAGGA CCGCAGCCCC ACCTATGGGG AGAGACAGCC CTTGCTGAGC AGATCCCGTC 1740 

CTTGCCCTCT CCCAGGGAGG ACCCGTGGCG TGTGGCGAAG ATGGTCAAGT CCTACCTGCA 1800 

GCAGCACAAC ATCCCACAGC GGGAGGTGGT CGATACCACT GGCCTCAACC AGTCCCACCT 1860 

GTCCCAACAC CTCAACAAGG GCACTCCCAT GAAGACGCAG AAGCGGGCCG CCCTGTACAC 1920

CTGGTACGTC CGCAAGCAGC GAGAGGTGGC GCAGCGTAAG TAATGACCCT ACCCCGCATC 1980 

TTCCCTGGGA GGGCCCAGGA CTCTCCCCTA ACTCATAGGT GGGGGCTGGA AGCTTCACCA 2040 

TCCCCATTAC ACAGACAGGT AGATGGAAAG GAAGTCAGTG GGATTCAACC TGCATTTATT 2100 

ACCTATTCTG CGCCAGGCAC TCTGTGGGAC GGGAGTANAC TTGGTCCTGA ACATCCAAAG 2160 

ATGAATGAAA TGGGTCCCTG CTTTCTTTTT CTTTTTTTAG ATACGTGACT CTGGAAAAAT 2220

ATGTAAGCTC TCTGAGCCTC AGCTTCTTCA TCTGTACAAT GGGGATAGTA AATGTGCCAA 2280

ATCAGAACAA ATGCTAATGC TTACCTGCAG TCTTGTACTG AGAAGGATGG TGAGATCATA 2340

TCTTGGGTTG GTAGGAAAGC ATTCAGGGAT TGATTAGTGA TGTTTGCCTT GAACACAGGT 2400

TAAGAAAGTG ATGGCATGTG TGCTGTGTGT TTGTCATCAG TAGATTAGAT GATTTCTAAG 2460

TTCTAGCTGT AAGCTCCTCT GGTTCAGCGC CATGGCAATG AGAAAGAATC AAGGGCAAGG 2520

TCAGGGGAAT GGACGAGGGA AGGTGAGAGT GGCCAGTACC CCACTCACGG CTTTCTGTGC 2580

CTGCAGAGTT CACCCATGCA GGGCAGGGAG GGCTGATTGA AGAGCCCACA GGTGATGAGC 2640

TACCAACCAA GAAGGGGCGG AGGAACCGTT TCAAGTGGGG CCCAGCATCC CAGCAGATCC 2700

TGTTCCAGGC CTATGAGAGG CAGAAGAACC CTAGCAAGGA GGAGCGAGAG GTACAACGGC 2760

GGGCGGGAAA CAGTGCTGGT TTGGTCTGGG CTGCGGCAAG GCCAGGGGAA GGGGAAGGTG 2820

ACTCTAGGTC CTGTAAAAGG CTGTCCAGTT GCCGAGAACT CCTGATATTG GCTTAGCCTG 2880

GCCCAGAAAA TTGAGAATAC TTGAACCTAA GCCCATTCCT CGCAGCCCCC CTGCACCNTG 2940

GACACCAAGC AACCCCTTCC ATGGATGCTC ACCCAATTCG ATTCTCTCTA CAATCCTATG 3000

GCTCTTTTGC TCACTTTATG AATGGAGAGA CTGAGGTCAG ACAGACTGTC AATTGCCCAA 3060

GGTCACACAG CAGACCTGGC ATTGGAACCC AGATCTGCCA GCCTCAAACC CTCCGGCAGA 3120

GNTCAGCTTC TCAGAACCCT CCCCTTCATG CCCAGGACAG GGTTCCTCTG AGCCTGGCCT 3180

GGAGGCTCAT GGGTGGCTAT TTCTGCAGGG CGGAATGCAT CCAGAGAGGG GTGTCCCCAT 3240

CACAGGCACA GGGGCTGGGC TCCAACCTCG TCACGGAGGT GCGTGTCTAC AACTGGTTTG 3300

CCAACCGGCG CAAAGAAGAA GCCTTCCGGC ACAAGCTGGC CATGGACACG TACAGCGGGC 3360

CCCCCCCAGG GCCAGGCCCG GGACCTGCGC TGCCCGCTCA CAGCTCCCCT GGCCTGCCTC 3420



CACCTGCCCT CTCCCCCAGT AAGGTCCACG GTAAGTGGTA TGTGGGGACA AGGGACACGT 3480 

GGGAAGGTGG GAGGGTTGGG GAGGACTGTC CCATTGACAG CAGTCACCTA AACCTCTTTG 3540

CACGTCAGTT TGGTTCCATT CGCAGCTGAC CCAGGGATTG GCAAAAGGTA GAAACAAAGG 3600

CAGATTTGCT GGCTGCATAA AGGCAGACAG GCAGATGGCC TAAGCAAACC AATGGAGTTT 3660 

GAAGTGCTGA GGGCTGTGGA GGCAGGGGAG GGCAGGGAAG TGGGGTGCTG AGGCAGGACA 3720 

CTGCTTCCCT CTCCAGGTGT GCGCTATGGA CAGCCTGCGA CCAGTGAGAC TGCAGAAGTA 3780

CCCTCAAGCA GCGGCGGTCC CTTAGTGACA GTGTCTACAC CCCTCCACCA AGTGTCCCCC 3840 

ACGGGCCTGG AGCCCAGCCA CAGCCTGCTG AGTACAGAAG CCAAGCTGGT GAGTGTCCTT 3900

GCTTGTAAGG AAAACCCAAC CTCATCTTTC CTTGGCAGGG AGATTCTGGA GCAGTCCCTA 3960 

GGGAGGCCCT GTGGGGACCC CGGCCCCCCG GACACAGCTT GGCTTCCCCT CGTAGGTCTC 4020 

AGCAGCTGGG GGCCCCCTCC CCCCTGTCAG CACCCTGACA GCACTGCACA GCTTGGAGCA 4080

GACATCCCCA GGCCTCAACC AGCAGCCCCA GAACCTCATC ATGGCCTCAC TTCCTGGGGT 4140

CATGACCATC GGGCCTGGTG AGCCTGCCTC CCTGGGTCCT ACGTTCACCA ACACAGGTGC 4200

CTCCACCCTG GTCATCGGTA AGCTGGTGGG GATGGGTGGG CACCTGGGTG GGAGGCTCAT 4260

GGGGCAACCG CANAATCCAG GAGCTGGAAA AGCCACTGGG ACTCATTCAT TCATTCATTC 4320

ATTCATACAA CATGTTAGGA GAGGGGAGCA GAGAACTGAC CCCATGGCCT TTGCACTGCT 4380

GTGGTACCCC AGGGCTCCAG GGAACCGCAG TTTGACAACT TTTGAACAAG TCACCGCTTG 4440

CTTTTCCCAT TAGCTTAGAC AAAGAGCTAA AGGCTCAGAG AGGGGGAATG ACTTGCCAGA 4500

GCCACTTAAA TTAGTGGCAG GTCCCAGTGG AGGGCTGTTT CCTGACCACC TTGCCCCTTC 4560

TTCCAAACCA CGGGCTCTGG GAAGGAGAGG TGGTGCCCTT GGGAGGTCTT GGGCAGGGGT 4620

GGGATATAAC TGGGGGGCCC AGCTGATTCC CTCCCCTTCC ACTCCAGGCC TGGCCTCCAC 4680 

GCAGGCACAG AGTGTGCCGG TCATCAACAG CATGGGCAGC AGCCTGACCA CCCTGCAGCC 4740 

CGTCCAGTTC TCCCAGCCGC TGCACCCCTC CTACCAGCAG CCGCTCATGC CACCTGTGCA 4800

GAGCCATGTG ACCCAGAACC CCTTCATGGC CACCATGGCT CAGCTGCAGA GCCCCCACGG 4860 

TGAGCACCCT GTGCCCCACA CAGCAGGAGA TGATGATAGA GGTTGGCTGT CAATGGATGC 4920 

AGGGGAAAGG GGTGCCTGGC AGGCATTGCA GTCTGCATGT GTCTCTGGGA CAAGTGTGTT 4980 

TCCGTGATTG AGGGTGTCTG CAGGCCAGTG TGTTCCCATG TGAATGCACG TATCTGTGTG 5040 

TGTGCACGAC TGCTTGTGTG AGCAGATCCC TAGTGCGTGT CTGGGTGTGT ATCGGTTGTG 5100

CATGCATTTG TGTGCATGCC TGTGTTTCTC TGAAACTCTT AGGGCCATAT GAATTTCTAA 5160 

AATCTATTCA GACCAGTTTT GAAAATCAGC CTTGGATCTC CAACTGCTGC CCAGTCTGGC 5220 

TGTTCAGCAG GCCCCATGCC CCCCTTTCCC CAGTCTTGAG GCCTGGGACT AGGGCTGTCA 5280 
GGCACGTTTG CCACGTCTGC CCCTCTCTCC CCTGCGGCCA GCCCTCTACA [illegible] 5340

CGAGGTGGCC CAGTACACCC ACACGGGCCT GCTCCCGCAG ACTATGCTCA [illegible] 5400

CACCAACCTG AGCGCCCTGG CCAGCCTCAC GCCCACCAAG CAGGTAAGGT [illegible] 5460

TGGCCCTCCC TCGGCCTGTG ACAGAGCCCC TCACCCCCAC ATCCCCCGGG [illegible] 5520

TGCTCTGCTC CCCCAGGTCT TCACCTCAGA CACTGAGGCC TCCAGTGAGT [illegible] 5580

CACGCCGGCA TCTCAGGCCA CCACCCTCCA CGTCCCCAGC CAGGACCCTG [illegible] 5640

GCACCTGCAG CCGGCCCACC GGCTCAGCGC CAGCCCCACA GGTGAGAGGC [illegible] 5700

CCCCCTCCCT TACTGTCCCT GCCCCCTTCC ATGTTGGTCC CACCCCTTCT [illegible] 5760

GTCACTGTGG GGCTGTGCAT GCAGCAGGCC TAGGGCTGCT GTGAGGAAGC [illegible] 5820

GTGGAAGGGT GGGGTGGCTT CCATGAATCC AGTGTTCACA GTAAGATGTA [illegible] 5880

TCCATGGGCG GCCGTGGACC CTGGCTGGGA GGCTCCCTTT GTTAAGAACC GAGGGTAGAG 5940

GTGTGACTTT GGGGTTCCTG TTATGTGCTG TGATCCAGGA [illegible] [illegible] 6000

[illegible] [illegible] [illegible] [illegible] GGTGCCTGGT GGGTGGCTAG 6060

CAGCCTTGTT TGCCTCTGCA GTGTCCTCCA GCAGCCTGGT GCTGTACCAG AGCTCAGACT 6120

CCAGCAATGG CCAGAGCCAC CTGCTGCCAT CCAACCACAG CGTCATCGAG ACCTTCATCT 6180

CCACCCAGAT GGCCTCTTCC TCCCAGTAAC CACGGCACCT GGGCCCTGGG GCCTGTACTG 6240

CCTGCTTGGG GGGT 6254



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 631 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu
1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu
20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Glu Gly Pro Leu Asp Lys Gly Glu
35 40 45

Ser Cys Gly Gly Gly Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu
50 55 60

Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp
65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu
85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro
100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile
115 120 125

Pro Gln Arg Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu
130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala
145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln
165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp
180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro
195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro
210 215 220

Ser Lys Glu Glu Arg Glu Thr Leu Val Glu Glu Cys Asn Arg Ala Glu
225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser
245 250 255

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg
260 265 270

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly
275 280 285

Pro Pro Pro Gly Pro Gly Pro Gly Pro Ala Leu Pro Ala His Ser Ser
290 295 300

Pro Gly Leu Pro Pro Pro Ala Leu Ser Pro Ser Lys Val His Gly Val
305 310 315 320

Arg Tyr Gly Gln Pro Ala Thr Ser Glu Thr Ala Glu Val Pro Ser Ser
325 330 335

Ser Gly Gly Pro Leu Val Thr Val Ser Thr Pro Leu His Gln Val Ser
340 345 350

Pro Thr Gly Leu Glu Pro Ser His Ser Leu Leu Ser Thr Glu Ala Lys
355 360 365

Leu Val Ser Ala Ala Gly Gly Pro Leu Pro Pro Val Ser Thr Leu Thr
370 375 380

Ala Leu His Ser Leu Glu Gln Thr Ser Pro Gly Leu Asn Gln Gln Pro
385 390 395 400

Gln Asn Leu Ile Met Ala Ser Leu Pro Gly Val Met Thr Ile Gly Pro
405 410 415

Gly Glu Pro Ala Ser Leu Gly Pro Thr Phe Thr Asn Thr Gly Ala Ser
420 425 430

Thr Leu Val Ile Gly Leu Ala Ser Thr Gln Ala Gln Ser Val Pro Val
435 440 445

Ile Asn Ser Met Gly Ser Ser Leu Thr Thr Leu Gln Pro Val Gln Phe
450 455 460

Ser Gln Pro Leu His Pro Ser Tyr Gln Gln Pro Leu Met Pro Pro Val
465 470 475 480

Gln Ser His Val Thr Gln Asn Pro Phe Met Ala Thr Met Ala Gln Leu
485 490 495

Gln Ser Pro His Ala Leu Tyr Ser His Lys Pro Glu Val Ala Gln Tyr
500 505 510

Thr His Thr Gly Leu Leu Pro Gln Thr Met Leu Ile Thr Asp Thr Thr
515 520 525

Asn Leu Ser Ala Leu Ala Ser Leu Thr Pro Thr Lys Gln Val Phe Thr
530 535 540

Ser Asp Thr Glu Ala Ser Ser Glu Ser Gly Leu His Thr Pro Ala Ser
545 550 555 560

Gln Ala Thr Thr Leu His Val Pro Ser Gln Asp Pro Ala Gly Ile Gln
565 570 575

His Leu Gln Pro Ala His Arg Leu Ser Ala Ser Pro Thr Val Ser Ser
580 585 590

Ser Ser Leu Val Leu Tyr Gln Ser Ser Asp Ser Ser Asn Gly Gln Ser
595 600 605

His Leu Leu Pro Ser Asn His Ser Val Ile Glu Thr Phe Ile Ser Thr
610 615 620

Gln Met Ala Ser Ser Ser Gln
625 630



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6433 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128: 
CATGAACCCC GAAGAGTAGT GTCTTCTCTC TGGACTAAAG CGGAACTGAG AACCGGTGGA 60

AAAGCCCCGC GCCTAGGCTG CAAGGCACTG GCTTAACAAG TCCAAAGGTT AGGTGAAGTT 120

TGGCTGATAA GCAGAACCAG TAAAAGAAGG TCTCTAGCCC CCCAGCGTGA GTACAATGGA 180

CCCTGGCAAA GCCCCGCTCC CGGCCCAGGT CTTCTGCTCT CCAGGTCTGC CCCTCCGGCT 240

CTCCCTCTCT CCGGGTTTCC CCCTCCCCAC CATCATTTGC ATCCAGCCGA AAGCTGGGCC 300

CTTCCCACTA ATTTGCATAT CTTATATGGC CTAATGGTGG CGATCATGGC AAGTTAGAAG 360

TTTTCTGACT CCTTTCGGAG GAGCCTCCGG GACCCCGGGG AGTAACAGGT GTCTGGAGGC 420

TGAAGGGTGG AGGGGTTCCT GGATTTGGGG TTTGCTTGTG AAACTCCCCT CCACCCTCCT 480

CTCTCGCACC CACCCACCCC CTCACCCCCT TCTTTTTCCG TCCTTGGAAA ATGGTGTCCA 540

AGCTCACGTC GCTCCAGCAA GAACTCCTGA GCGCCCTGCT GAGCTCCGGG GTCACCAAGG 600

AGGTGCTGGT TCAGGCCTTG GAGGAGTTGC TGCCATCCCC GAACTTCGGG GTGAAGCTGG 660

AGACGCTGCC CCTGTCCCCT GGCAGCGGGG CCGAGCCCGA CACCAAGCCG GTCTTCCATA 720

CTCTCACCAA CGGCCACGCC AAGGGCCGCT TGTCCGGCGA CGAGGGCTCC GAGGACGGCG 780

ACGACTATGA CACACCTCCC ATCCTCAAGG AGCTGCAGGC GCTCAACACC GAGGAGGCGG 840

CGGAGCAGCG GGCGGAGGTG GACCGGATGC TCAGGTAGGC GCAGAGCCAG GTGGAGGGGA 900

CCCACCCGAA CCCCTGGAGC CCCGGCCCCG GGCCTGAGTG ACACTGCGCC CGACCACACT 960

CGCCAAGCCC GTTTCCCACC AAAAAATTCC CCCGGGGGGC GCTCTGCTTC TCTCCCAACA 1020

CCCGGACCCT TCCCAATCCC TTAGCGGGAC AACCCTGCGG CCCACCGGGC TTCTTCTCCC 1080

CAGGCCCAGG CCATCGTCCT CAGAAGAAAG GGATGAGGTG TACCGTACAG GGGCAGTCAC 1140

CTTCTCCTCT GTTTAGCTTC CATTTTGGCC TCATGTCTAC CCCAAAGTTG TAGCTTAGAT 1200

GGGGGGAAAA TTCAGAATTT TGCATAGACC ATAGGTAGCA CCCCCTAGAA AAAGAATGTT 1260

TCTCCCCAGA TGTCTCCCAC TAGTACCCTA ACCATCTGCT TGTCTGTCTA GTGAGGACCC 1320

TTGGAGGGCT GCTAAAATGA TCAAGGGTTA CATGCAGCAA CACAACATCC CCCAGAGGGA 1380

GGTGGTCGAT GTCACCGGCC TGAACCAGTC GCACCTCTCC CAGCATCTCA ACAAGGGCAC 1440

CCCTATGAAG ACCCAGAAGC GTGCCGCTCT GTACACCTGG TACGTCAGAA AGCAACGAGA 1500

GATCCTCCGA CGTAAGTGTT TTCATCCTGC CTCTGCCTCA ACCTGAAGTG ACCTTTGCCC 1560

TGTCACCCCA TTGGCTGCCT CAGTTTCCCT TTCATCGACA AGGCCTTGTG AGCACTTGGC 1620

AGATATGAGG AAGGTGGCAA GTAGATTTGG CCTTGGTGGT TGCTGTACAA TGGATTGGCT 1680

TCTGTCATGT TCTTCAGTCA CAGCCCCCTT GCTACCCAGC CAGTTGCTCT GAGGAGCCTG 1740

TCAGTGTGAT TGAGCTCACC CACTTGACAT CAAATACAGG AGTTCAGGAT GCAGAGTGTT 1800

GCTTCATCTC TGAAGGCCAG TGAGCCAAAG GGGAAAAAAT AATAATTTTC TTAAAACTAT 1860

AGCTGGCTAT GTTTGAGCTC CTTCAAAGAA AGGAAAAGGG TGGCTTTGCT GGAGCAACTG 1920

AGGTGGGCAG TAAGGGCCTG TGCTGAGGGC TCCCCATCTC CAGCTCCACA TGCAGTGAGA 1980

GAAGGTTGCA AAGCTTAGTT AGACGAGGGG AATAAACCTG TCTTCGTCCG TTGTCTGTCT 2040

GTCTGTCTGT CTGTCTGCTG AGTGAAGGCT ACAGACCCTA TCAAATCTAC TCCTTTCTCT 2100

TTTCAGAATT CAACCAGACA GTCCAGAGTT CTGGAAATAT GACAGACAAA AGCAGTCAGG 2160 

ATCAGCTGCT GTTTCTCTTT CCAGAGTTCA GTCAACAGAG CCATGGGCCT GGGCAGTCCG 2220 

ATGATGCCTG CTCTGAGCCC ACCAACAAGA AGATGCGCCG CAACCGGTTC AAATGGGGGC 2280 

CCGCGTCCCA GCAAATCTTG TACCAGGCCT ACGATCGGCA AAAGAACCCC AGCAAGGAAG 2340

AGAGAGAGGC CTTAGTGGAG GAATGCAACA GGTAACACCA CCAGAAGCTC AGGTGGGCAG 2400 

GTGGGCAAGT ACACAGACCC AGGAACCCTC CCCTCGGTCC TGGGATATTG AGACACTAGT 2460 

TATACAGATA AGTGTGGCTA AATCAGAGCT TCTCAAAGTA TGTTCCACAG TGATTGTGTG 2520 

TTTTGGGCCA AGCACCAACA AGTCCCCCCG CCCCCCTTCA CTCACCATCT CCCCTCCATC 2580 

CATTCCCAGG GCAGAATGTT TGCAGCGAGG GGTGTCCCCC TCCAAAGCCC ACGGCCTGGG 2640

CTCCAACTTG GTCACTGAGG TCCGTGTCTA CAACTGGTTT GCAAACCGCA GGAAGGAGGA 2700

GGCATTCCGG CAAAAGCTGG CCATGGACGC CTATAGCTCC AACCAGACTC ACAGCCTGAA 2760

CCCTCTGCTC TCCCACGGCT CCCCCCACCA CCAGCCCAGC TCCTCTCCTC CAAACAAGCT 2820

GTCAGGTAAG CAAAGGTTGG GCCTCACTGC CTCGGCAACC CAACCATCCT GGTTCTTGCC 2880

ACGGATCTTA TCTGGTTTAA GGGTTTTCAG AGGAGCAAAC GCTTTTGAGA TGATCCTAGG 2940

GCCGCTCTCT CATTGCCAGA ATATACTCCC CTGGAAATAA TGTGTGGCTC TGATCAGTTC 3000

CAAGGCACTG GGGATACATC AGTGAACAAA ACAAACGAGA TAAAAATTTC CTGCCCTCGT 3060

GGCGCTTACA TTCTAGAATT AAATAGAGAA CATGCCATAT TTACCCTGGA GAAAAGCAGC 3120

CGATATTTCT TGTGGGTGGA CAGGGGAGGA GAAAGCAACT TTATTTTCTT ATTACCCACC 3180

CTTGAAAACA AGAGGTGCCG AGTCATTGTT CCAGGACCCT GGTGGCACTA ATGTTCCCTA 3240

CTGGGTTTGT GTTGTTTTGC AGGAGTGCGC TACAGCCAGC AGGGAAACAA TGAGATCACT 3300

TCCTCCTCAA CAATCAGTCA CCATGGCAAC AGCGCCATGG TGACCAGCCA GTCGGTTTTA 3360 

CAGCAAGTCT CCCCAGCCAG CCTGGACCCA GGCCACAATC TCCTCTCACC TGATGGTAAA 3420 

ATGGTGAGTA CACCTGGGCC ATTGTCGCTC TGGAGCTGAT AAGATAAGAG GCAAAACAAA 3480 

CACAACTTCT CACAAGGCCT GCCTCAAACA ATGAACCATT GTAGCCCCAT AGGGGAAAAT 3540

GAGGGCTGTC CAGAGTCGGA AAGGAGAGGT AGTGCTGGTG ACCCACCCTT TGGCGGGTAG 3600 

AAAACCCAAA GTGATGGGAT TACAGGGGTG AAGCACCATG CCCAGCCAAT AATTGTTATT 3660 

GAGTGAATGA AGGAATGAAT TTGAGAACTA GTCATGCCAA GGAATCGCTA AGTCACATCG 3720 

TGTTGGAAAC TGCTCTTTGT GGTCCAAGTC CACCCATGTT TCTCTTGTTT TTTTCTCTCC 3780 

ATCAGATCTC AGTCTCAGGA GGAGGTTTGC CCCCAGTCAG CACCTTGACG AATATCCACA 3840



GCCTCTCCCA CCATAATCCC CAGCAATCTC AAAACCTCAT CATGACACCC CTCTCTGGAG 3900 

TCATGGCAAT TGCACAAAGT AAGTTCTATT CTTGGTTGGA AAACCTGGGG GCAGGGAGAA 3960 

GAAGAATGGG AAGCAAATTA ATGTGGTGAA AAATAACTGT AGGTCTCCTT CAAACTCACC 4020 

CACAACTAGT AAATTTGGTT TAACTTCTTT AGTTTCTCAT CTGTCTCCTT AAATCCAATA 4080 

TTTGGATTGT TTAGCCTAAA ACAAGAAAAA ATTGTGGAAT GGATTTGGAT CCTGGTCACA 4140 

GTTTAGCAGC TGTGCATCCT GGGTCAAATC ATTGAACCTA TGACTCTGGG AGACTCTCAG 4200 

GCTTTAATCA GATCTGTTTA ATGCCCATCT CCAACCCACA ACTCATTGTG GAACTTGAGC 4260 

AAGTAAATTA ATATCTCCAA GTCTCCGTTT CTTTACACTT GCCTCCCATG GAATCTCCTA 4320

TGTAACAGGC TCAGCCCGGT GACTGGGACA TTGAGCGGGG GCTCAAATGA TGGCATCCAT 4380

CCACCTCTCC TTATCCCAGG AGCTGTCTGT GTCTTTTCCT CTTGCTCCCA CAGGCCTCAA 4440 

CACCTCCCAA GCACAGAGTG TCCCTGTCAT CAACAGTGTG GCCGGCAGCC TGGCAGCCCT 4500

GCAGCCCGTC CAGTTCTCCC AGCAGCTGCA CAGCCCTCAC CAGCAGCCCC TCATGCAGCA 4560

GAGCCCAGGC AGCCACATGG CCCAGCAGCC CTTCATGGCA GCTGTGACTC AGCTGCAGAA 4620

CTCACACAGT AAGGACACGG GCATGTGGAG GGAGGGAGCA CTCAGGACCC TCAGTGGCCA 4680

ACCACTTTCC CTCTCTGGGT CTGAACTTTC TCGGAAGTTT ATTGGCTTGG TCACTTTTCC 4740 

CTGCCTATGA TCAACCGACT AAGACAATTT CTCAAGCATA ACTCTTGAGT GTTGCTGTAC 4800 

CTTTTCTAGT CCTCTTCTCT ACCCCTGAGA TTCCCAGGGA AGGGTTTGAA TGACCTTTGC 4860 

TCCCGTTCCG TACCGGAGGC CTCCCTGGTA GGAAATGTGT TCTGAGAGCA GGTGGTTTCT 4920

CCCTCACAGC CAAGCATCCA CATGCTTTCG GGAGTTGGTT ATGTGACTTG GAATTTACAT 4980 

GAATCTTATG GATAACTAAT ATGAGAAATC CCCACTATAA CCACCAGCCC TTTTATCTAC 5040

CTGAGGAGAT GGGAGCTATG GTGTGGGATG GGGGCTCTGT ACCTGTGTCT TTGCCTGTGT 5100 

ATGCACCTTG ATTCTGTCTT CACTCTGTCT CTCCAGTGTA CGCACACAAG CAGGAACCCC 5160 

CCCAGTATTC CCACACCTCC CGGTTTCCAT CTGCAATGGT GGTCACAGAT ACCAGCAGCA 5220

TCAGTACACT CACCAACATG TCTTCAAGTA AACAGGTAAT GCCAGCAGGA TATGCGGGGG 5280 

TTGGGGTGTG GGCAGGGTGT GATAAGGCCA TGGATGTGCA AAGGTTGTGG CAAGCATGGA 5340

CTCGGCCAGA AATTATATCC TCTTTGCTGG TTGAGTTGGG CATCATCTCC CTTAGAGAAG 5400 

CCAAACTAAT GGCCCATGAC CCTGCCAAAT GACACAGCTG AGCACCCTCT CTCCTCTCTC 5460 

TCTGCAGTGT CCTCTACAAG CCTGGTGATG CCCACACACC ACTTACTTCG TGCGCAACAA 5520

CAAGGACCCT GTTTTCCACA CCATCACCCT CTGGGCAGCT GTCATGGAAA AGCCCAGTGA 5580 

CCTGACCAGC ACCTGCGAGA GGTCCCTGCT ACCTGACGGA CGTCCTGCTG GCACCTCAGA 5640 

CAATCCACTC TCAGGAGGCG CAGCCCGAAG CCCAGTTTCC CTTCTATGCA GTATTGCCAC 5700 
AATGCCTCTC CCACGATGTC [illegible] [illegible] [illegible] [illegible] 5760

ACCGAAGAGG AAGCAAGAAA [illegible] [illegible] [illegible] [illegible] 5820

ATGCGAAAAC TTGAATCTGT [illegible] [illegible] [illegible] [illegible] 5880

GCCAAACACA CTGTAAATAT CCACAGACTC [illegible] [illegible] [illegible] 5940

AGATTTCTTT TAAAGAAGTA AATTTGTCCA [illegible] [illegible] [illegible] 6000

AGTGCAATTT CCCCTCTGTG TCCTCTCCCC [illegible] [illegible] [illegible] 6060

TTAGTTTTCT TTGTAAAGGT CAGAGTCAAA ATTTCAAAAG [illegible] [illegible] 6120

CATGGAGAAA CATCCTAAGT [illegible] [illegible] [illegible] [illegible] 6180

CTTATGGGGA CAGCATACCT TGGACTGACT ACCAGCTAAC TCCAGTCTCC TGACATTAAG 6240

ACACACCTCT GGATCCCTGG AGGGGCTGAA TGTAGTGTGT CAGAGTAACA TGCCAGCTTC 6300

CTGTGGGCCA GGAGCTCAGC CTGCACTCCC TAAGAAACCC CAGGGCAGGG AAACTGGCTG 6360

TTTGATAGCA GAAGAAAAAG TTGCAGTCTC AAAAGCCTTC CATTAAAACA ATTTATTTTA 6420

TCACTAAAAA AAA 6433



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 609 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Met Val Ser Lys Leu Thr Ser Leu Gln Gln Glu Leu Leu Ser Ala Leu
1 5 10 15

Leu Ser Ser Gly Val Thr Lys Glu Val Leu Val Gln Ala Leu Glu Glu
20 25 30

Leu Leu Pro Ser Pro Asn Phe Gly Val Lys Leu Glu Thr Leu Pro Leu
35 40 45

Ser Pro Gly Ser Gly Ala Glu Pro Asp Thr Lys Pro Val Phe His Thr
50 55 60

Leu Thr Asn Gly His Ala Lys Gly Arg Leu Ser Gly Asp Glu Gly Ser
65 70 75 80

Glu Asp Gly Asp Asp Tyr Asp Thr Pro Pro Ile Leu Lys Glu Leu Gln
85 90 95

Ala Leu Asn Thr Glu Glu Ala Ala Glu Gln Arg Ala Glu Val Asp Arg
100 105 110

Met Leu Ser Glu Asp Pro Trp Arg Ala Ala Lys Met Ile Lys Gly Tyr
115 120 125

Met Gln Gln His Asn Ile Pro Gln Arg Glu Val Val Asp Val Thr Gly
130 135 140

Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro Met
145 150 155 160

Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln
165 170 175

Arg Glu Ile Leu Arg Gln Phe Asn Gln Thr Val Gln Ser Ser Gly Asn
180 185 190

Met Thr Asp Lys Ser Ser Gln Asp Gln Leu Leu Phe Leu Phe Pro Glu
195 200 205

Phe Ser Gln Gln Ser His Gly Pro Gly Gln Ser Asp Asp Ala Cys Ser
210 215 220

Glu Pro Thr Asn Lys Lys Met Arg Arg Asn Arg Phe Lys Trp Gly Pro
225 230 235 240

Ala Ser Gln Gln Ile Leu Tyr Gln Ala Tyr Asp Arg Gln Lys Asn Pro
245 250 255

Ser Lys Glu Glu Arg Glu Ala Leu Val Glu Glu Cys Asn Arg Ala Glu
260 265 270

Cys Leu Gln Arg Gly Val Ser Pro Ser Lys Ala His Gly Leu Gly Ser
275 280 285

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg
290 295 300

Lys Glu Glu Ala Phe Arg Gln Lys Leu Ala Met Asp Ala Tyr Ser Ser
305 310 315 320

Asn Gln Thr His Ser Leu Asn Pro Leu Leu Ser His Gly Ser Pro His
325 330 335

His Gln Pro Ser Ser Ser Pro Pro Asn Lys Leu Ser Gly Gly Lys Gln
340 345 350

Arg Leu Gly Leu Thr Ala Ser Ala Thr Gln Pro Ser Trp Phe Leu Pro
355 360 365

Arg Ile Leu Ser Gly Leu Arg Val Phe Arg Gly Ala Asn Ala Phe Glu
370 375 380

Met Ile Leu Gly Pro Leu Ser His Cys Gln Asn Ile Leu Pro Trp Lys
385 390 395 400

Gly Val Arg Tyr Ser Gln Gln Gly Asn Asn Glu Ile Thr Ser Ser Ser
405 410 415

Thr Ile Ser His His Gly Asn Ser Ala Met Val Thr Ser Gln Ser Val
420 425 430

Leu Gln Gln Val Ser Pro Ala Ser Leu Asp Pro Gly His Asn Leu Leu
435 440 445

Ser Pro Asp Gly Lys Met Ile Ser Val Ser Gly Gly Gly Leu Pro Pro
450 455 460

Val Ser Thr Leu Thr Asn Ile His Ser Leu Ser His His Asn Pro Gln
465 470 475 480

Gln Ser Gln Asn Leu Ile Met Thr Pro Leu Ser Gly Val Met Ala Ile
485 490 495

Ala Gln Ser Leu Asn Thr Ser Gln Ala Gln Ser Val Pro Val Ile Asn
500 505 510

Ser Val Ala Gly Ser Leu Ala Ala Leu Gln Pro Val Gln Phe Ser Gln
515 520 525

Gln Leu His Ser Pro His Gln Gln Pro Leu Met Gln Gln Ser Pro Gly
530 535 540

Ser His Met Ala Gln Gln Pro Phe Met Ala Ala Val Thr Gln Leu Gln
545 550 555 560

Asn Ser His Met Tyr Ala His Lys Gln Glu Pro Pro Gln Tyr Ser His
565 570 575

Thr Ser Arg Phe Pro Ser Ala Met Val Val Thr Asp Thr Ser Ser Ile
580 585 590

Ser Thr Leu Thr Asn Met Ser Ser Ser Lys Gln Cys Pro Leu Gln Ala
595 600 605

Trp



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10014 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 
TGGGTTGCCT GTGACTGCAC TGGCGATACC CCCACAAAGC CCACTCTGAA GGTAGGAGAC 60

GGGTGGAGAG AAACAGGGGG ATGGCAAGGG GGATACGAAA CAGGGAGAGG GAGGAGGGGG 120

AAGAGGATGG ACGTCTACCA GGCCCCACTT GGTGCTTGAT TTATGCCATC TCATTTCCTT 180

CTCAAACCAC CCTTTGAAGT TGATTGTACA TTTTACAGAA AAGGAAACTG AGGCTCGGAG 240

AGGAGAATCA TTTACCCAAG GTCCCAGTTA GTAGACGGTA GGTGCCTGAA TGTAAATCCA 300

GGTCTCTGCC TGCTCCGGGA GGGGGTGGGG GTGAGGGAAA CAGGAGAATG TGATGGGAAA 360

ATCCGAGATG GAGCCAGCCT GGGCCAGAAA CACTGGGAGC TGTGGGAGAC GGAGAGGGGC 420

AGGGTGGGAT CACAGGGAGC AGGAGCGGGG AATTGGAGGT GAATCTGGCC CTCCCAAACT 480

TCCAGTCCAT TCTGCTCCCA GGGGAACCGG GAAACTGCGG GGGAACTGGA AGGGAGCTCC 540

CAGAACAAGG ATCCAGAAGA TTGGCATCTG GGGCCTGGGA TTTAGGTTTC TAAATCGTGG 600

GCCATGGGGC AGCCTTATCT CTGCAAAAGC ATTGAGGGTA GAAGTCAATG ATTTGGGAAG 660 

TTATTGAATT AGGGGATCTC GGAGGTAGGC TGTCAGTGCC TGATAGTATC AGTTAGAATG 720 

CCTGACTTGG GGTGACAATG GCTTGGAGGG GTGGGTGAGT CAAGGGTCAA ATGAGTGCCC 780 

GTGAGTCATG ATGCCTGCCT TGTACAATTG ATAACTGAAC ATCGGTGAGT TAGGGCCCCA 840

GCAGTTGTAA TTAGCACCCC GGGTGTCAGC CAGAAACCAA CAAACAGCCA AATCCCTGCA 900 

GCCCCGCCCA GCCTATCCAC CGGCGGGGGA CCGATTAACC ATTAACCCCC ACCCCTCCCC 960 

GGCAGAGCCT CCACCCCTTC ACAGAGGCTA GGCCAAGACT CCCAGCAGAT CTTCCCAGAG 1020 

GACGGTTTGA AAGGAAGGCA GAGAGGGCAC TGGGAGGAGG CAGTGGGAGG GCGGAGGGCG 1080 

GGGGCCTTCG GGGTGGGCGC CCAGGGTAGG GCAGGTGGCC GCGGCGTGGA GGCAGGGAGA 1140

ATGCGACTCT CCAAAACCCT CGTCGACATG GACATGGCCG ACTACAGTGC TGCACTGGAC 1200

CCAGCCTACA CCACCCTGGA ATTTGAGAAT GTGCAGGTGT TGACGATGGG CAATGGTAGG 1260 

TGGGGGCAGA TGTGCCCAGG TGTGCCAGTG GGGGCAGGTG TGCCTGGGTC CAGGAGCAGA 1320

TCTTTGGCAC TCAACTTTGG GGTGGGAGGA GAATGATACA AAATGGTAGG TTGGTCCTAC 1380 

AGGCCAGCAC AGGTGTTGCC AAGTGAAGCC CATGTGCCCA GGCACAGTGA TCACAGGCAT 1440 

TCTGGGTGAA GGGAGGCCTG CAAGGGCCAA TTTCCAGCAA AAGTCGATCC CGGCTATTCC 1500

TCCCAGGCCC TTCCAGTCCT CACTGCCTCA CAGTGGCTCT GCTTGGCGCT TGGCACAGTG 1560 

ACATGATGGT GAGCTCCCCC TTGGTGCCCA GCTCCAGCGA TTCAGCCCAG CACGGCCCCT 1620 

TCGTGAACCC CTTGGGCCTA GGTTCAGAGA GACGGCAAGG GATGTTGTAT CCCTGGAGAT 1680

GGTGGTTGGA GACATAACCG CATTTCTCGG TGTCTTTGGG ACTTTCCTAG GGAAATGAAA 1740

TTGGCACTTA GGGAAAATGG AGCTCTCAGG GAAGTTTTGC TAACTACGAA GCCAACTCAG 1800

CACTGTGTGT GTTGTGTGTG CGTTCGTGTG TGATAGTGAG TTTCCATGTA GGTTGTATGG 1860 

GTGGGGTGAT GCCTTCAGGA ACCCATTTGC ATATGTGTGT TCATTTGTCT CTGTGTGTGA 1920 

GTTCTGGGTC TATTTTCCTT TGTATTCATT GAGTGGGTCT GTGTTTGTGT CTTAGGAGTT 1980 

GCCCGTGTTG ATCTTGCTTA TGTATGTAAG TGTGTATGTG TGTGTACTTG TGTCTGTGGA 2040

TGTTTGTACA TGTGTGCTGT GTGTGCGGGT CATAGAGCAC ATGCGTTTGT GCATGCGGAC 2100 

CTGTTGGAGT GCCCTGTTCT TCCTGCATCT TTATCCTGTA TGGGCGTTTT GTCGTGTGCC 2160 

CATATTTGTA CCTGCTGTGT ATATATGCAG TTCCCTGTGC TGCGGGCGGG GGTCAGCGGT 2220 

CTCTGGTGTG CACGACTGCA CAGACCCAAA TGCAGGACTC TGTTGTTGCC ACTCACCAAG 2280 

TGAGATTCAT ATCAGCAACA TGTCCGTTTG TCTCTGAGCA GATTTTGTTG CCGCTGCGTC 2340

TCGCCAGATT GAGGCATCCC CTCCGACATC ACTGGAGCAT ATCTGGAGGG GTGGACAGTT 2400

CTCCACAGGG AGGTAGGGGA AAAGAGGAGG CCCGGAAACC CCTCCTGGAG GGAAGAGCCC 2460

CATCGGTCCC AGGCCAGCCT CAGAGGAGAG GGGGCAGGCA GCTGGCTGAG GTCAGCCTGC 2520

CACCCTGCTT CCTTCTGTGT CTTGGAGCCA CTCAGCCAGT ATGAGGCTGC AGCTCCAGCT 2580 

GAGGTCTGGA ATCTTGTGGT CAGCTCAGCT AGGGTGAGGA GGCAGCTGCT GGGCACTGCT 2640 

TGTTGTCAGC TCAGCAGGTG CTCACCTGCC CCTGCCGTCC AGTCACGTGT GACCTTGGGC 2700 

ATGTCACCTC CCCTATCCTG GCTTCTGTAT CTTCTACAAA ACAGGCTTCA TTCCCCCAGG 2760 

CCTGCTGGCT GGACGGCTTT TAGGCCTGTC TGAGGACCAC GCCAGGAGCG CAAGGCAAAA 2820

ACACACCAGA GATCCCCTTG CGAGTTAGGA GGCCGGCTCC CACCCCAGAA GGTGGCCAGG 2880 

TTTTCATGCC TTCCTAGAGA AAGCTGGGGC TGGTGGCCTC CACCACAGGG AGACGCAGAC 2940 

CCTCAGAAAC AAGTCTGTGA AGTCACAACC AGCCCCAGTT TACAGATGTG AAACTGAAGC 3000 

TCCAAAAAGT CAGGAGGTCA CTGAGTGGGG AGGTGATGGA GTGGGAACAG CCCCCAGATC 3060

TGGCTGAGGC CGAAGCCCTG GAGAGATCCC CGCAAGGCTC CCTTAGATGC CTGACATTCT 3120

GCTCTTCCTG AAGCCTCACT CCCTTCTCTC CTGGCGCAGA CACGTCCCCA TCAGAAGGCA 3180

CCAACCTCAA CGCGCCCAAC AGCCTGGGTG TCAGCGCCCT GTGTGCCATC TGCGGGGACC 3240

GGGCCACGGG CAAACACTAC GGTGCCTCGA GCTGTGACGG CTGCAAGGGC TTCTTCCGGA 3300

GGAGCGTGCG GAAGAACCAC ATGTACTCCT GCAGGTGAGG AGCCTCAATT TCTTCAGCTG 3360

GGAAATGGGC ACACTTGGGC TCATGGCCCC AAGGTCTGTC TTCTCCCTGA GTGGGTAGGT 3420

CCCAGAGACA GCTGCCCTTC AGGGCCTTCA AGGCTCTTCT GGTTTTGTAA AAGACTTTGT 3480 

GAATCCAAGA AGAGCATCTA TTCTAGGAAC CACATTTACT GATCATCAAG CTACTGGCTG 3540 

CCGTTTATTG AGCTCTTATC ATATGCCAGG CACAATACTA AGTCTTTGTG TGTATTTACG 3600 

TACTCCAGAG GTCAAGGTTC CCAACTCAGC TCTAACACCA ACCAGCAGAG CGACCCAGGA 3660 

CCACATGTTG CCTCTCTGAG CCTCAGTTTT CCCATGTTTA GCAGGACAGG ACTGGGCTCT 3720

TAGAGAGTTC ATAGCACCTT TCCAGCTCCT GGTGGGTTCA AGAGAGAACT CCCGGGATGA 3780 

AGAGATGAGA GCACTGAGGT TGGGGGGTCA ACTGGATAGC CAGGGCCCTA GTTCTGTCCT 3840 

AAGAGGAGGA AGTTGTGTCT TCTCCATCCA ACCATCCAAA GCCCTCCCCA GATTTAGCCG 3900 

GCAGTGCGTG GTGGACAAAG ACAAGAGGAA CCAGTGCCGC TACTGCAGGC TCAAGAAATG 3960 

CTTCCGGGCT GGCATGAAGA AGGAAGGTGA GCCTCGGCCC TCCCCGCCCC ACCACCACTG 4020

CCCCACCTGC ACCCACAGCT CCCCGACAGT CATTTACAAC TGTAGCCACA CTTTATGACT 4080 

CAGTGGCAGG CCCCAGGGTG ACTGGCTAAT GGCTGAGAAG AGGGAGGGCC TGGAAATCTG 4140 

ACCATAGGGA GCGGCTGGGC TTGGTCTTGA GAAAGATTCT CCCACTCCTC ATCAGTCACA 4200

GACACCCCCA CCCCCTACTC CATCCCTGTT CTCCCTCCTC ACCTCTCTGT GCCTCCTCAC 4260

CCGTCCAGAA TGAGCGGGAC CGGATCAGCA CTCGAAGGTC AAGCTATGAG GACAGCAGCC 4320

TGCCCTCCAT CAATGCGCTC CTGCAGGCGG AGGTCCTGTC CCGACAGGTA CCGGGGTGAT 4380

CCTGCCACCC ACCCAGGGAT CCCCCACACT ACAGAGGAGC TCACCTCCTC CACCTCCATT 4440

CTCCCCAGCC AGGCCCTGGA GCAGCTGACG GGAGGGGCCT CAGATATTAC AGAAGGGACA 4500

CTGAGTGCGG TTTCACATGG CCCAGTTTGC AGCAAGGGCA GGAATCGAAC CTGGCGCCCT 4560

GGGGCACTTT CTAATTCATC CTACTGCCTG CATCCCACAG GCCAAGCAGA GTCTTCACCT 4620 

TCACTGAGGG CCTGCGATCA GCTCAGCTCC GAGAGAACAG AGCAGTGGCT CAGTGGAGAG 4680 

AGGTGGCAAA GTGGGGCCCA GCCCTTCCCT TGCTGAGTGA CCTTGGGCAA GTCACAGCAC 4740 

CTCTCTGAGC CATGGTTGCC TCATTGTCAG AAAAGGATGA TGATTTTTTG CCCTGCTTCT 4800

CCTCTAAGGC TGACAGACTC CTTGGGGCTC TAAAGCTGTT CTCCCTCATC CCTGCCTCCT 4860

CCCTCCCTCC GTTTTTACCC TGAGCTTCCT TCAGAGCTGG AGGGCACCCA CTATCCAGCC 4920

CCCTCCCCAC ATCTGATTCC AGGGAGGGGG CTCTGTGCAG GGGACAGAGA ATGCGGGAGG 4980

GCCCGGACAT CTCCAGCATT TTCTTCCCTG TATCTCTCGA AGATCACCTC CCCCGTCTCC 5040

GGGATCAACG GCGACATTCG GGCGAAGAAG ATTGCCAGCA TCGCAGATGT GTGTGAGTCC 5100

ATGAAGGAGC AGCTGCTGGT TCTCGTTGAG TGGGCCAAGT ACATCCCAGC TTTCTGCGAG 5160

CTCCCCCTGG ACGACCAGGT GAGGATGGGC GTGGATGGTG GGCAGTAGTG GGCAGTGGGC 5220 

GGGGCAGCCA GGGGGCTGCT GGCCCACCTG GGATATAGCC GTGGACTGGC TTGATTTTAT 5280

TTTATTTAAC AAAATATGTA GTGCACACAC GTGTCTGAAA CTTTAAATCA CCTTACAAAT 5340

ATTAACTCAG TTAGCTCCTC CAACAACTCT ATGAGGTAGG TACTAAGGTA CTATTATTAC 5400

TGCCATCTCA TAGGTGAGGA GATTGGGGCA CAGAGAGGTT AAGTAACCTG CTCAAGGTCA 5460 

CATAGCTACT ATCCAGCATA GCTGGGATTT TTACAAAGCA CCCTTCATAA TTCTCCATAG 5520 

CTGGTCCATG GGTGGGAATT TGGGACCCAC AGTTTTGGAA CTTTTTGGGA TCATAGACCT 5580 

TTTTGAGAAT CTCAAAAAAG AAAAAAAAAG CACACAGAAT GTTGCTTACA GTTTCATCAG 5640 

GCACACAGAA GAGGCCCAGC ACGAAGCAGT TTCTTGCCCA AGGACACAGC AGTTCAAGGA 5700

CAGAGTCAGC GCGAGGTCTC TCAGCTCTGA GCACATGTTC TTTCCCCTTC CAGGTTTCTA 5760 

GTTTTATGGG TAGTAGTTTT ATGATGCCCA TTTCACAGTT CAGGCAGGTA GAGGCAGAGG 5820 

GGAGCATTAA GCTGACTTGC CCAGCGTCAC TGAGTTGGCT ACGGGCAGCC TTCCCAAGGG 5880 

TACAGATGGC AAACACTGTT CCTTCTCTCT TTCAGGTGGC CCTGCTCAGA GCCCATGCTG 5940 

GCGAGCACCT GCTGCTCGGA GCCACCAAGA GATCCATGGT GTTCAAGGAC GTGCTGCTCC 6000



TAGGTGAGGC GGCTGCCTGC CCTGGCCAGG GCTCCAGGGA GGGTATGCCT AGCATGGCAC 6060 



TCACCCAGGC AAGGAGATTC ACATGGTGGC ATGCAAGGGT GAGGGAGACT AGTCAGGAGT 6120 
GGCCCTGTCC TCAGGCTTGC ATTGGAGGGC TCCAGGACTC AGTTTTCAAC TGGGTACCCC 6180



ACTCAGATGC AAGGAAATGT GGATGCAAGT CACCAAATTC CCAGCATTGA AGTCAGAGCA 6240 



CGATCAGGGT TATCCCTGGA ATTACCTGTG CATCCTTTTT TCTTTTGACA GAGTCTTGCT 6300

CTGTCACTCA GGCTGGAGTG CAATGATGTG AGCAAACACT ACCTATTTTA ATATAACAAT 6360



GCTATGAGGG AGCTCGATTA TTTATCCTCA TCTTATAGAT AAGAAAACTG AGGCACAGAG 6420 
AGGTTAAGTA ACTTATCCAA CTATAACCAG CTATCAGGGG CAGAGCCATT TAAGCAGGGC 6480



AGTGCAGTTC CAGAATCTGG TCCTTTAACC TTGATGCTTT GGTGCCTATC AGGTGACCTT 6540 



TGAATGTCAT CGATCTTGTG AGTCATGTTG GTAAATGGAG CTTGGGTCAT GTGAAAGAGG 6600 

TCCTAGAAAG CCAAGTTCCA AGCTCAGCCG GATGACTCAA GGCAGCTTAT CTTCTGAATC 6660 

TGGGCCTCAG CTTCCTTACC TGTGAAATGG GAGTCACCAT CCCTGCAGGT CCTCCTCCCA 6720

CAGGCACCAG CTATCTTGCC AACTTAAAAG CCAAAACTAG AGGAGAGGGG TCAACCCAAG 6780

GTGACTTCCC ATCCTCCCTC CCTCCCAACC CTTCCAGGCA ATGACTACAT TGTCCCTCGG 6840



CACTGCCCGG AGCTGGCGGA GATGAGCCGG GTGTCCATAC GCATCCTTGA CGAGCTGGTG 6900

CTGCCCTTCC AGGAGCTGCA GATCGATGAC AATGAGTATG CCTACCTCAA AGCCATCATC 6960

TTCTTTGACC CAGGTACAGT GCACACCTCC TAAGCCATCC CTGACTCTCT CTCCAGAACG 7020

CTCTGCCAGA CTTCTCCTAT TGGGTTCTGT ACACTGAGTT CACAGCCTCA TCTCATGTTA 7080

ACGACAGCCA GGAGAGGCCG TTTTCATTTA ACAGATGAGG CAAGTCAAGA TTTGAAGAGA 7140

CAATATGGCC GGGCGCAGTG GCTCACACCT GTAATCCCAT CACTTTGGGA GGCTGAGGCG 7200 

GGCGGATCAC CTGAGGTCAG GGGTCAAGAT GAGCCTGGCT AACATGGAGA AACCCCATCT 7260 



CTACTTAAAA GTGGCTCTGC CAACAACTGG CTGTGCGACC CAGGACAAGT CCTATCTTTG 7320 
CACTGTGTCT GGGTTTCCCC GTGTGTAAGA TGAGGCGGTT GCTAGGTGCT TATTGGATGC 7380



ATTCCTCAAG TCCCGCCCTC CATCTCCTAT TCCCCTCTCT TCTGGTTTAG TGCTTTAGGA 7440 



AATGTGGCAG AAATCTTTTT CTGCCTGTGT CTAGGAAATC ATAATTCATG CTGGCGTACC 7500 

CTGGTTGTTG AGGTCCCTGA ATCCTTGTGC CCACACTGCT GAAGACTCCT TGTGTGACAC 7560 



AAGTCAGGGG ACATCTGGGT CTTGACTCCC CAGATGCTCC AGCTGGACCC TGCTGCCCTC 7620 
CCTTGCCCAC CCTCTTCCAT TGTAGATGCC AAGGGGCTGA GCGATCCAGG GAAGATCAAG 7680



CGGCTGCGTT CCCAGGTGCA GGTGAGCTTG GAGGACTACA TCAACGACCG CCAGTATGAC 7740 



TCGCGTGGCC GCTTTGGAGA GCTGCTGCTG CTGCTGCCCA CCTTGCAGAG CATCACCTGG 7800 

CAGATGATCG AGCAGATCCA GTTCATCAAG CTCTTCGGCA TGGCCAAGAT TGACAACCTG 7860

TTGCAGGAGA TGCTGCTGGG AGGTCCGTGC CAAGCCCAGG AGGGGCGGGG TTGGAGTGGG 7920



GACTCCCCAG GAGACAGGCC TCACACAGTG AGCTCACCCC TCAGCTCCTT GGCTTCCCCA 7980 

CTGTGCCGCT TTGGGCAAGT TGCTTAACCT GTCTGTGCCT CAGTTTCCTC ACCAGAAAAA 8040 



TGGGAACAAG GCAATGGTCT ATTTGTTCAG GCACCGAGAA CCTAGCACGT GCCAGTCACT 8100 



GTTCTAAGTG CTGGCAATTC AGCAAAGAAC AAGATCTTTG CCCTCGGGGA GGCTGTGTGT 8160



GTGTGAGTAT GTATGGATGC GTGGATATCT GTGTATATGC CCGTATGTGC GTGCATGTGT 8220 



ATATAAAGCC TCACATTTTA TGATTTTGAA ATAAACAGGT AATATGAGGG ACACATAGAT 8280 

GCTATAAGTA GGTCAGTTGG CTGCAGCAGA GATGTGGGGG ATGAGGCTGA AAGGTGAGGC 8340



GGGACCAAAT GGTTGAAGGA CTTGCACTCC AAGGAGCTTT GAGAGCCATT GATTACATCC 8400 



ATTATGTTAC TATGTGACCA ATACATTACT CATTAGAACA TTTACGTGAT CTCAGAGCTT 8460



CCTTATATGC ACCTTGTTCC TTTCAACTCA CTTTTGTTCT CTTGGTTTTT TGGGGTCCTC 8520

TTAACACCCT CATGAAGTCT ATAGATGGGA ATGGTACACC CTAGTTTACT AACCCAGGAA 8580

TAGGTACCCA ACAGGCACTG CCAATATTGG ATGGGCTGGT TGATTGGCCA CGCCTGAGGA 8640 



AGATGGCGTC CCAAGGCCTG AGGTCTGCAT CCCAGACTCT CCATCCTGAT CGACCTTCTC 8700

TACCTGCAGG GTCCCCCAGC GATGCACCCC ATGCCCACCA CCCCCTGCAC CCTCACCTGA 8760

TGCAGGAACA TATGGGAACC AACGTCATCG TTGCCAACAC AATGCCCACT CACCTCAGCA 8820

ACGGACAGAT GTGTGAGTGG CCCCGACCCA GGGGACAGGC AGGTGGGCAA ACTCTGGGAT 8880

TTTACCTTGC AAAGGGTGAG GATGGGGCTT AAGACAGGAG GCAGGAGAAA GTGGAGTCTA 8940



GAAGGTAGAA CCAGGATGCA ACAGTTTTCT GGGTTCCAGG GTAGGGAATA AAGGGCAAGA 9000 



TTGTCCATTT GTTGAGGCTG TTTATTCAGT AAGGTGACTG ACAGCCTTTA CTGAATGAAG 9060

CCATTGTTGG GATGAGGCAA TCCACTGGAT GAGGTAACCC ATTGGGTGAA GATGTCTTGG 9120



GTGAGAATTC CATTAGTTGA CATTGTCCAT TAAGTAAAAG TGGTCATTGA AGTAAGGCTG 9180 

CACAGTTGGG TAAGGCTATC CATTAGACAT TAGATGAGAC TACCCATTGG GTCAGGATGT 9240 



CTGCTGGGCT ATTTGGGAGA AGCAGTCCAA GTCTGCATAT CAAATAAATG ATGGAGGAGA 9300 
TGGGTGGTAG GACCTTCCAG ACCTCATAAA ACTTAGGCTT TATGATCTGG GACTCACAGA 9360

AGGTTGAGCA ATAAAAGACC TTAGGGATTA TCTGGCTTAA TTAATTCTCT CATTTTATAG 9420



AGGAAGAAAT TAAGTCAAGG TGGGGCAGGG TGGGAGGGGA GAACTTTCCC GGGGCTCTTC 9480 

ATTTACTCCC ACAAAGGCTG GAATTTTGAG CAGCCCCTGT CTGTCTGTTT GTCCTTCCCC 9540 



ACCCCTGAGA CCCCACAGCC CTCACCGCCA GGTGGCTCAG GGTCTGAGCC CTATAAGCTC 9600 



CTGCCGGGAG CCGTCGCCAC AATCGTCAAG CCCCTCTCTG CCATCCCCCA GCCGACCATC 9660

ACCAAGCAGG AAGTTATCTA GCAAGCCGCT GGGGCTTGGG GGCTCCACTG GCTCCCCCCA 9720
GCCCCCTAAG AGAGCACCTG GTGATCACGT GGTCACGGCA AAGGAAGACG TGATGCCAGG 9780
ACCAGTCCCA GAGCAGGAAT GGGAAGGATG AAGGGCCCGA GAACATGGCC TAAGGCACAT 9840
CCCACTGCAC CCTGACGCCC TGCTCTGATA ACAAGACTTT GACTTGGGGA GACCCTCTAC 9900
TGCCTTGGAC AACTTTCTCA TGTTGAAGCC ACTGCCTTCA CCTTCACCTT CATCCATGTC 9960

CAACCCCCGA CTTCATCCCA AAGGACAGCC GCCTGGAGAT GACTTGAGCC TTAC 10014 

(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 567 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131: 

Met Arg Leu Ser Lys Thr Leu Val Asp Met Asp Met Ala Asp Tyr Ser
1 5 10 15

Ala Ala Leu Asp Pro Ala Tyr Thr Thr Leu Glu Phe Glu Asn Val Gln
20 25 30

Val Leu Thr Met Gly Asn Gly Pro Ser Ser Pro His Cys Leu Thr Val
35 40 45

Ala Leu Leu Gly Ala Trp His Ser Asp Met Met Ile Leu Leu Pro Leu
50 55 60

Arg Leu Ala Arg Leu Arg His Pro Leu Arg His His Trp Ser Ile Ser
65 70 75 80

Gly Gly Val Asp Ser Ser Pro Gln Gly Asp Thr Ser Pro Ser Glu Gly
85 90 95

Thr Asn Leu Asn Ala Pro Asn Ser Leu Gly Val Ser Ala Leu Cys Ala
100 105 110

Ile Cys Gly Asp Arg Ala Thr Gly Lys His Tyr Gly Ala Ser Ser Cys
115 120 125

Asp Gly Cys Lys Gly Phe Phe Arg Arg Ser Val Arg Lys Asn His Met
130 135 140

Tyr Ser Cys Arg Phe Ser Arg Gln Cys Val Val Asp Lys Asp Lys Arg
145 150 155 160

Asn Gln Cys Arg Tyr Cys Arg Leu Lys Lys Cys Phe Arg Ala Gly Met
165 170 175

Lys Lys Glu Ala Val Gln Asn Glu Arg Asp Arg Ile Ser Thr Arg Arg
180 185 190

Ser Ser Tyr Glu Asp Ser Ser Leu Phe Ser Ile Asn Ala Leu Leu Gln
195 200 205

Ala Glu Val Leu Ser Arg Gln Ile Thr Ser Pro Val Ser Gly Ile Asn
210 215 220

Gly Asp Ile Arg Ala Lys Lys Ile Ala Ser Ile Ala Asp Val Cys Glu
225 230 235 240

Ser Met Lys Glu Gln Leu Leu Val Leu Val Glu Trp Ala Lys Tyr Ile
245 250 255

Pro Ala Phe Cys Glu Leu Pro Leu Asp Asp Gln Val Ala Leu Leu Arg
260 265 270

Ala His Ala Gly Glu His Leu Leu Leu Gly Ala Thr Lys Arg Ser Met
275 280 285

Val Phe Lys Asp Val Leu Leu Leu Gly Asn Asp Tyr Ile Val Pro Arg
290 295 300

His Cys Pro Glu Leu Ala Glu Met Ser Arg Val Ser Ile Arg Ile Leu
305 310 315 320

Asp Glu Leu Val Leu Pro Phe Gln Glu Leu Gln Ile Asp Asp Asn Glu
325 330 335

Tyr Ala Tyr Leu Lys Ala Ile Ile Phe Phe Asp Pro Asp Ala Lys Gly
340 345 350

Leu Ser Asp Pro Gly Lys Ile Lys Arg Leu Arg Ser Gln Val Gln Val
355 360 365

Ser Leu Glu Asp Tyr Ile Asn Asp Arg Gln Tyr Asp Ser Arg Gly Arg
370 375 380

Phe Gly Glu Leu Leu Leu Leu Leu Pro Thr Leu Glu Ser Ile Thr Trp
385 390 395 400

Gln Met Ile Glu Gln Ile Gln Phe Ile Lys Leu Phe Gly Met Ala Lys
405 410 415

Ile Asp Asn Leu Leu Gln Glu Met Leu Leu Gly Gly Gly Pro Cys Gln
420 425 430

Ala Gln Glu Gly Arg Gly Trp Ser Gly Asp Ser Pro Gly Asp Arg Pro
435 440 445

His Thr Val Ser Ser Pro Leu Ser Ser Leu Ala Ser Pro Leu Cys Arg
450 455 460

Phe Gly Gln Val Ala Gly Ser Pro Ser Asp Ala Pro His Ala His His
465 470 475 480

Pro Leu His Pro His Leu Met Gln Glu His Met Gly Thr Asn Val Ile
485 490 495

Val Ala Asn Thr Met Pro Thr His Leu Ser Asn Gly Gln Met Cys Glu
500 505 510

Trp Pro Arg Pro Arg Gly Gln Ala Ala Thr Pro Glu Thr Pro Gln Pro
515 520 525

Ser Pro Pro Gly Gly Ser Gly Ser Glu Pro Tyr Lys Leu Leu Pro Gly
530 535 540

Ala Val Ala Thr Ile Val Lys Pro Leu Ser Ala Ile Pro Gln Pro Thr
545 550 555 560

Ile Thr Lys Gln Glu Val Ile
565

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:

AAGTAAGCCT TGTTTTTCCA CACTCATTCT CCCAGGTTTT CTTTGGATAG GCTTACTTTT 60

CCATGCTGGA GGAGGGGCTA TCCCTTCATT TTGCCTCTCC CGCTTCCCTC CCTCTCCCCC 120

TCCCCCTGCT TTCTCTCCCT CTGCACTTTG TGAACTGCTG CTGCAGTGCT GAAGTCCAAA 180

GTTCAGTAAC TTGCTAAGCA CACAGATAAA TATGAACCTT GGAGAATTTA CCAATGTAAA 240

CAGATAGCCA AGGGTCCCTT TATCAGCACT GGCTCAGGAC AGTCGTGGGG GGTCTGAAGT 300

GGCTCAATTT TGTATTTTGT TTTTTTTGGG GGGGTGTAAA GGCGGGAGGC TGCGCTGTGC 360

CCGCTGCTGA CAGTCGGGCG TGTTACCTCG GGAACATGGT GTAGGGAAGC TGGAAGCAGG 420

ATAACGTGGA ACTCAACCCA AGAAACGCCA GCCTGAAGAC CATGGTCTCG 470



(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 467 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

TCACAGCTAT TAGCTCATCG CTGCCAAATT GCCCCTTTAC CTAGGCTTGT GTCACTTTCA 60

CCTTCTCATT CTCTTACTTT TACATTCTTC CTTGATATTT TGCTTTTTCA ACTTTTGGAA 120

ATTTCTTTCT CTCTTCTACC CCTCCTCATA TTCCTCTGCA CTCCCCCCTC TCTAACTCAT 180

GCACTTTGTG GGGTCCAAAG TTCAGTAACT TGCAAAGCAC AGGGATAAAG ATGAACCTTG 240

GAAGATTTAC TCTGCTCTGA TGTAAACAGA GAGTGACAAG GGTCCCTTAT CTATGTCTCA 300

GAGAAGCCTG TCCGGGGGGT GACCACTTGC TGGTTGTGGC TGCACAGTGT GTTTTTTTGG 360

GGGGGAGGAG GAAACAGAAG GTGGGTAGAG CATGGACTCC CGCCCGCTGA TCCGTGTTAC 420

AGCCGCAGAT GGTGAGGCAG TAGAAGGCAA CAGACAGGAT GGCGTCT 467



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 479 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

TTTCGGGGGT GGGACCCAAC GCTGCTCTCC TGATGGCCTC CCTGGCTCCC AGCACCTTCC 60

ATCCCAGCTG CTCAGGGCCC CTCACCTGCG CCTCCCCCAC CCTCCCCTCT GCCCACTCCC 120

ATCGCAGGCC ATAGCTCCCT GTCCCTCTCC GCTGCCATGA GGCCTGCACT TTGCAGGGCT 180

GAAGTCCAAA GTTCAGTCCC TTCGCTAAGC ACACGGATAA ATATGAACCT TGGAGAATTT 240

CCCCAGCTCC AATGTAAACA GAACAGGCAG GGGCCCTGAT TCACGGGCCG CTGGGGCCAG 300

GGTTGGGGGT TGGGGGTGCC CACAGGGCTT GGCTAGTGGG GTTTTGGGGG GGCAGTGGGT 360

GCAAGGAGTT TGGTTTGTGT CTGCCGGCCG GCAGGCAAAC GCAACCCACG CGGTGGGGGA 420

GGCGGCTAGC GTGGTGGACC CGGGCCGCGT GGCCCTGTGG CAGCCGAGCC ATGGTTTCT 479



(2) INFORMATION FOR SEQ ID NO: 135: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 605 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

TGGGGCCTGG GATTTAGGTT TCTAAATCGT GGGCCATGGG GCAGCCTTAT CTCTGCAAAA 60

GCATTGAGGG TAGAAGTCAA TGATTTGGGA AGTTATTGAA TTAGGGGATC TCGGAGGTAG 120

GCTGTCAGTG CCTGATAGTA TCAGTTAGAA TGCCTGACTT GGGGTGACAA TGGCTTGGAG 180

GGGTGGGTGA GTCAAGGGTC AAATGAGTGC CCGTGAGTCA TGATGCCTGC CTTGTACAAT 240

TGATAACTGA ACATCGGTGA GTTAGGGCCC CAGCAGTTGT AATTAGCACC CCGGGTGTCA 300

GCCAGAAACC AACAAACAGC CAAATCCCTG CAGCCCCGCC CAGCCTATCC ACCGGCGGGG 360

GACCGATTAA CCATTAACCC CCACCCCTCC CCGGCAGAGC CTCCACCCCT TCACAGAGGC 420

TAGGCCAAGA CTCCCAGCAG ATCTTCCCAG AGGACGGTTT GAAAGGAAGG CAGAGAGGGC 480

ACTGGGAGGA GGCAGTGGGA GGGCGGAGGG CGGGGGCCTT CGGGGTGGGC GCCCAGGGTA 540 

GGGCAGGTGG CCGCGGCGTG GAGGCAGGGA GAATGCGACT CTCCAAAACC CTCGTCGACG 600 

ACATG 605

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 478 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

TCCTGGAGAG TGGGACCCAG CGCCGCACCC AGAGGCCTCC TGGCTCCTGC TGCCTCTAGC 60

CCTGCGCCCC TGGCCCCTCT CCACCTCCCC CACCCTCCCT TCTGCTCACT CCCAATTGCA 120

GGCCATGACT CCGGTCCGCG TCCCTCTCAC CCCCATGAGG CCTGCACTTG CAAGGCTGAA 180

GTCCAAAGTT CAGTCCCTTC GCTAAGCGCA CGGATAAATA TGAACCTTGG AGAATTTCCC 240

CAGCTCCAAT GTAAACAGAG CAGGCAGGGG CCCTGATTCA CTGGCCGCTG GGGCCAGGGT 300

TGGGGGCTGG GGGTGCCCAC AGAGCTTGAC TAGTGGGATT TGGGGGGGCA GTGGGTGCAG 360

CGAGCCCGGT CCGTTGACTG CCAGCCTGCC GGCAGGTAGA CACCGGCCGT GGGTGGGGGA 420

GGCGGCTAGC TCAGTGGCCT TGGGCCGCGT GGCTGGTGGC AGCGGAGCCA TGGTTTCT 478

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

TGGGCTTGGG TGTTAGGTTT CCAGTTCAAG CGACCCAGGA CAGCTTTATC TCAAATTGAG 60

GATAGAAGTC AATGATCTGG GACGTGATTG GCTTAGGGCT TCATAGTGGT AGGCTTGCCA 120

GTGTCTAAAC ATGTCAGCTG GGTTGTCCAC CTTGGTGAGA CTTGGGGGCT GCTGAGGCAA 180

GGGGTCCAAC CAATGCCAGT CCTGTTGGGT GCCTGCCTTG GAAGATTGGT AAGTGACTAT 240

TAATGAGCGG GAGGTGGGGG GGGGGCAACA GTTGTAATTA GCACCCCAGG TGTCAGTCAG 300

AAACCAACAA ACAGCCAAAT CCTCGTGGCT CCACCCAGCC TACCCAGCAA CGGGGGTGAT 360

TAACCATTAA CTCCTACCCC TCCCCACAGA GCCTCCACCC TCTGCAGAGG CTAGGCCAGG 420

ACGCCAGGCT GAGTCTCCCA GAGGACAGTT TGAAAGAGAG GAAGGCAGAG AAGGGACCTG 480

GGAGGAGGCA GGAGGAGGGC GGGGACGGGG GGGGCTGGGG CTCAGCCCAG GGGCTTGGGT 540

GGCATCCTGG GCCGGGCAGG ACAGGGGGCT AAGGCGTGGG TAGGGGAGAA TGCGACTCTC 600

TAAAACCCTT GCCGGCGATA TG 622

(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

TCTTGGGCAG TGGGACCAGC GCTGCTCCCA GAGGCCTCCT GGCTCCTGGT GCCTCTCTCC 60

CTGCGCCCCT GGTTCCCGCT CCACCTCCCC CACCCGCCCT TCTGCTCACT CCCAATTGCA 120

AGCCATGGCT CCCGGTCCGG TCCCTCTCGC TGCTGTGAGG CCTGCACTTG CAAGGCTGAA 180

GTCCAAAGTT CAGTCCCTTC GCTAAGCACA CGGATAAATA TGAACCTTGG AGAATTTCCC 240

CAGCTCCAAT GTAAACAGAG CAGCAGGGGG CCCTGATTCA CTAGCCGCTG GGGCCAGGGT 300

TGGGGGTTGG GGGTGCCCAC AGGGCTTGAC TAGTGGGATT TGGGGGAGCA GTGGGTGCAG 360

CGAGCCTGGT CCGTTGACTG CCAGCAGTAG ACACCGGCCG TGTGTGGGGG AGGCGGCTAG 420

CTCAGTGGCC TTGGGCCGCG TGGCCTGGCG GTAGAGGAGC CATGGTTTCT 470 



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 557 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Met Val Ser Lys Leu Thr Ser Leu Gln Gln Glu Leu Leu Ser Ala Leu
1 5 10 15

Leu Ser Ser Gly Val Thr Lys Glu Val Leu Val Gln Ala Leu Glu Glu
20 25 30

Leu Leu Pro Ser Pro Asn Phe Gly Val Lys Leu Glu Thr Leu Pro Leu
35 40 45

Ser Pro Gly Ser Gly Ala Glu Pro Asp Thr Lys Pro Val Phe His Thr
50 55 60

Leu Thr Asn Gly His Ala Lys Gly Arg Leu Ser Gly Asp Glu Gly Ser
65 70 75 80

Glu Asp Gly Asp Asp Tyr Asp Thr Pro Pro Ile Leu Lys Glu Leu Gln
85 90 95

Ala Leu Asn Thr Glu Glu Ala Ala Glu Gln Arg Ala Glu Val Asp Arg
100 105 110

Met Leu Ser Glu Asp Pro Trp Arg Ala Ala Lys Met Ile Lys Gly Tyr
115 120 125

Met Gln Gln His Asn Ile Pro Gln Arg Glu Val Val Asp Val Thr Gly
130 135 140

Leu Asn Gln Ser His Leu Ser Gln His Leu Asn Lys Gly Thr Pro Met
145 150 155 160

Lys Thr Gln Lys Arg Ala Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln
165 170 175

Arg Glu Ile Leu Arg Gln Phe Asn Gln Thr Val Gln Ser Ser Gly Asn
180 185 190

Met Thr Asp Lys Ser Ser Gln Asp Gln Leu Leu Phe Leu Phe Pro Glu
195 200 205

Phe Ser Gln Gln Ser His Gly Pro Gly Gln Ser Asp Asp Ala Cys Ser
210 215 220

Glu Pro Thr Asn Lys Lys Met Arg Arg Asn Arg Phe Lys Trp Gly Pro
225 230 235 240

Ala Ser Gln Gln Ile Leu Tyr Gln Ala Tyr Asp Arg Gln Lys Asn Pro
245 250 255

Ser Lys Glu Glu Arg Glu Ala Leu Val Glu Glu Cys Asn Arg Ala Glu
260 265 270

Cys Leu Gln Arg Gly Val Ser Pro Ser Lys Ala His Gly Leu Gly Ser
275 280 285

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg
290 295 300

Lys Glu Glu Ala Phe Arg Gln Lys Leu Ala Met Asp Ala Tyr Ser Ser
305 310 315 320

Asn Gln Thr His Ser Leu Asn Pro Leu Leu Ser His Gly Ser Pro His
325 330 335

His Gln Pro Ser Ser Ser Pro Pro Asn Lys Leu Ser Gly Val Arg Tyr
340 345 350

Ser Gln Gln Gly Asn Asn Glu Ile Thr Ser Ser Ser Thr Ile Ser His
355 360 365

His Gly Asn Ser Ala Met Val Thr Ser Gln Ser Val Leu Gln Gln Val
370 375 380

Ser Pro Ala Ser Leu Asp Pro Gly His Asn Leu Leu Ser Pro Asp Gly
385 390 395 400

Lys Met Ile Ser Val Ser Gly Gly Gly Leu Pro Pro Val Ser Thr Leu
405 410 415

Thr Asn Ile His Ser Leu Ser His His Asn Pro Gln Gln Ser Gln Asn
420 425 430

Leu Ile Met Thr Pro Leu Ser Gly Val Met Ala Ile Ala Gln Ser Leu
435 440 445

Asn Thr Ser Gln Ala Gln Ser Val Pro Val Ile Asn Ser Val Ala Gly
450 455 460

Ser Leu Ala Ala Leu Gln Pro Val Gln Phe Ser Gln Gln Leu His Ser
465 470 475 480

Pro His Gln Gln Pro Leu Met Gln Gln Ser Pro Gly Ser His Met Ala
485 490 495

Gln Gln Pro Phe Met Ala Ala Val Thr Gln Leu Gln Asn Ser His Met
500 505 510

Tyr Ala His Lys Gln Glu Pro Pro Gln Tyr Ser His Thr Ser Arg Phe
515 520 525

Pro Ser Ala Met Val Val Thr Asp Thr Ser Ser Ile Ser Thr Leu Thr
530 535 540

Asn Met Ser Ser Ser Lys Gln Cys Pro Leu Gln Ala Trp
545 550 555



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Met Asp Met Ala Asp Tyr Ser Ala Ala Leu Asp Pro Ala Tyr Thr Thr
1 5 10 15

Leu Glu Phe Glu Asn Val Gln Val Leu Thr Met Gly Asn Gly Pro Ser
20 25 30

Ser Pro His Cys Leu Thr Val Ala Leu Leu Gly Ala Trp His Ser Asp
35 40 45

Met Met Ile Leu Leu Pro Leu Arg Leu Ala Arg Leu Arg His Pro Leu
50 55 60

Arg His His Trp Ser Ile Ser Gly Gly Val Asp Ser Ser Pro Gln Gly
65 70 75 80

Asp Thr Ser Pro Ser Glu Gly Thr Asn Leu Asn Ala Pro Asn Ser Leu
85 90 95

Gly Val Ser Ala Leu Cys Ala Ile Cys Gly Asp Arg Ala Thr Gly Lys
100 105 110

His Tyr Gly Ala Ser Ser Cys Asp Gly Cys Lys Gly Phe Phe Arg Arg
115 120 125

Ser Val Arg Lys Asn His Met Tyr Ser Cys Arg Phe Ser Arg Gln Cys
130 135 140

Val Val Asp Lys Asp Lys Arg Asn Gln Cys Arg Tyr Cys Arg Leu Lys
145 150 155 160

Lys Cys Phe Arg Ala Gly Met Lys Lys Glu Ala Val Gln Asn Glu Arg
165 170 175

Asp Arg Ile Ser Thr Arg Arg Ser Ser Tyr Glu Asp Ser Ser Leu Phe
180 185 190

Ser Ile Asn Ala Leu Leu Gln Ala Glu Val Leu Ser Arg Gln Ile Thr
195 200 205

Ser Pro Val Ser Gly Ile Asn Gly Asp Ile Arg Ala Lys Lys Ile Ala
210 215 220

Ser Ile Ala Asp Val Cys Glu Ser Met Lys Glu Gln Leu Leu Val Leu
225 230 235 240

Val Glu Trp Ala Lys Tyr Ile Pro Ala Phe Cys Glu Leu Pro Leu Asp
245 250 255

Asp Gln Val Ala Leu Leu Arg Ala His Ala Gly Glu His Leu Leu Leu
260 265 270

Gly Ala Thr Lys Arg Ser Met Val Phe Lys Asp Val Leu Leu Leu Gly
275 280 285

Asn Asp Tyr Ile Val Pro Arg His Cys Pro Glu Leu Ala Glu Met Ser
290 295 300

Arg Val Ser Ile Arg Ile Leu Asp Glu Leu Val Leu Pro Phe Gln Glu
305 310 315 320

Leu Gln Ile Asp Asp Asn Glu Tyr Ala Tyr Leu Lys Ala Ile Ile Phe
325 330 335

Phe Asp Pro Asp Ala Lys Gly Leu Ser Asp Pro Gly Lys Ile Lys Arg
340 345 350

Leu Arg Ser Gln Val Gln Val Ser Leu Glu Asp Tyr Ile Asn Asp Arg
355 360 365

Gln Tyr Asp Ser Arg Gly Arg Phe Gly Glu Leu Leu Leu Leu Leu Pro
370 375 380

Thr Leu Glu Ser Ile Thr Trp Gln Met Ile Glu Gln Ile Gln Phe Ile
385 390 395 400

Lys Leu Phe Gly Met Ala Lys Ile Asp Asn Leu Leu Gln Glu Met Leu
405 410 415

Leu Gly Gly Ser Pro Ser Asp Ala Pro His Ala His His Pro Leu His
420 425 430

Pro His Leu Met Gln Glu His Met Gly Thr Asn Val Ile Val Ala Asn
435 440 445

Thr Met Pro Thr His Leu Ser Asn Gly Gln Met Cys Glu Trp Pro Arg
450 455 460

Pro Arg Gly Gln Ala Ala Thr Pro Glu Thr Pro Gln Pro Ser Pro Pro
465 470 475 480

Gly Gly Ser Gly Ser Glu Pro Tyr Lys Leu Leu Pro Gly Ala Val Ala
485 490 495

Thr Ile Val Lys Pro Leu Ser Ala Ile Pro Gln Pro Thr Ile Thr Lys
500 505 510

Gln Glu Val Ile
515



(2) INFORMATION FOR SEQ ID NO: 141: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
GCGGGACCGG ATCAGCA 17

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Arg Asp Arg Ile Ser
1 5



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 
GCGGGACTGG ATCAGCA 17 

(2) INFORMATION FOR SEQ ID NO: 144: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Ala Glu Val Leu Ser Arg Gln
1 5 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 16

(D) OTHER INFORMATION: /note= "N = C or T"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GCGGAGGTCC TGTCCNGACA GGTACCGGGG 



(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 8 

(D) OTHER INFORMATION: /note= "N = C or T" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
AAAGCAANGA GAGAT 



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 3

(D) OTHER INFORMATION: /note= "X = R or any amino acid"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

Lys Gln Xaa Glu
1